* XdsBootstrap: move two more methods out of the interface
* Automated change: Fix sanity tests
* XdsClient: add unit test
* Automated change: Fix sanity tests
* fix memory leaks
* add helper method
* add unsubscription
* add test for multiple subscriptions
* clang-format
* fix build
* fix flakiness
* add checking for other node fields
* add v2 test
* add response builder
* add test for update from server
* add test for update containing only changed resources
* clang-format
* fix build
* add test for resource not existing upon subscription
* add test for stream closed by server
* add test for multiple watchers for the same resource
* add test for connection failure
* clang-format
* add test for resources wrapped in Resource wrapper message
* add test for resource validation failure
* add test for multiple invalid resources, and fix a case in XdsClient
* add test for validation failure for already-cached resource
* add test for server not resending resources after stream disconnect
* clang-format
* fix XdsClient to report channel errors to newly started watchers
* fix XdsClient to send cached errors/does-not-exists to newly started watchers
* fix watcher to ensure events arrive in the expected order
* fix tests
* clang-format
* add test for multiple resource types
* fix xds_cluster_e2e_test
* Automated change: Fix sanity tests
* cleanup
* add federation tests
* clang-format
* remove now-unnecessary XdsCertificateProviderPluginMapInterface
* code review comments
* simplify XdsResourceType::Decode() API
* XdsClient: add unit tests for XdsClusterResourceType
* add XdsClient with gRPC bootstrap config
* add LB policy tests
* started adding CertificateProvider tests
* update for recent API changes
* fix merge bugs
* xDS resource validation: identify extensions by type_url instead of name
* fix build
* migrate to ValidationErrors
* add xds_common_types_test
* finish TLS tests and add LRS tests
* move ScopedExperimentalEnvVar to its own library and remove redundant e2e tests
* add circuit breaking and outlier detection tests
* add validation to outlier detection LB policy parsing
* clang-format
* Automated change: Fix sanity tests
* fix signedness
* fix sanity
* fix sanity
* iwyu
* update code for XdsResourceTypeImpl changes
Co-authored-by: markdroth <markdroth@users.noreply.github.com>
* Reland x2: Make GetDefaultEventEngine return a shared_ptr
* remove thread leak from NativeDNSResolver
This is not going to work for resolvers that support cancellation.
* give resolvers bounded lifetimes
Some resolver own EventEngines. EventEngines cannot run off the end of
the process since they have unjoined threads (problematic in a small set
of environments). This gives resolvers bounded lifetimes, and allows
replacement of resolvers without ASAN issues of deleting resolvers in
active use (occurs in tests).
* fix
* fix windows
* fix surface init test
* fix
* sanitize
* use after move
* the test must wait for the callback to be destroyed
* windows fix: delete the resolver on iomgr shutdown, not before
* Make TimerManager threads non-joinable
On gRPC shutdown, any unjoined TimerManager threads will cause TSAN to
detect thread leaks. This fix resolves issues I saw in end2end test
shutdown in another PR, where a single timer manager thread was always
alive after the test ended.
The long-term solution is to integrate the new ThreadPool here, but this
unblocks me for now.
* backport fix
* fix
* shared_ptr<EventEngine> in EventEngine benchmarks
* WorkQueue
* weaken the large obj stress test for Windows; documentation
* update comment
* Add WorkQueue microbenchmark. Results below ...
------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------
BM_WorkQueueIntptrPopFront/1 297 ns 297 ns 2343500 items_per_second=3.3679M/s
BM_WorkQueueIntptrPopFront/8 7022 ns 7020 ns 99356 items_per_second=1.13956M/s
BM_WorkQueueIntptrPopFront/64 59606 ns 59590 ns 11770 items_per_second=1074k/s
BM_WorkQueueIntptrPopFront/512 477867 ns 477748 ns 1469 items_per_second=1071.7k/s
BM_WorkQueueIntptrPopFront/4096 3815786 ns 3814925 ns 184 items_per_second=1073.68k/s
I0902 19:05:22.138022069 12 test_config.cc:194] TestEnvironment ends
================================================================================
* use int64_t for times. 0 performance change
------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------
BM_WorkQueueIntptrPopFront/1 277 ns 277 ns 2450292 items_per_second=3.60967M/s
BM_WorkQueueIntptrPopFront/8 6718 ns 6716 ns 105497 items_per_second=1.19126M/s
BM_WorkQueueIntptrPopFront/64 56428 ns 56401 ns 12268 items_per_second=1.13474M/s
BM_WorkQueueIntptrPopFront/512 458953 ns 458817 ns 1550 items_per_second=1.11591M/s
BM_WorkQueueIntptrPopFront/4096 3686357 ns 3685120 ns 191 items_per_second=1.1115M/s
I0902 19:25:31.549382949 12 test_config.cc:194] TestEnvironment ends
================================================================================
* add PopBack tests: same performance profile exactly
* use Mutex instead of Spinlock
It's safer, and so far equally performant in benchmarks of opt builds
* add deque test for comparison. It is faster on all tests.
* Add sparsely-populated multi-threaded benchmarks.
* fix
* fix
* refactor to help thread safety analysis
* Specialize WorkQueue for Closure*s and AnyInvocables
* remove unused callback storage
* add single-threaded benchmark for closure vs invocable
* sanitize
* missing include
* move bm_work_queue to microbenchmarks so it isn't exported
* s/workqueue/work_queue/g
* use nullptr instead of optionals for popped closures
* reviewer test suggestion
* private things are private
* add a work_queue fuzzer
Ran for 10 minutes @ 42 jobs @ 42 workers. Zero failures.
Checked in a selection of 100 good seeds after merging the thousands of
results.
* fix
* fix header guards
* nuke the corpora
* feedback
* sanitize
* Timestamp::Now
* fix
* fuzzers do not work on windows
* windows does not like multithreaded benchmark tests
* Revert "Revert "[event_engine] Thread pool that can handle deletion in a callback (#30763)" (#30972)"
This reverts commit ccc787a020.
* Update thread_pool.cc
* Revert "Revert "XdsClient: add unit test and fix watcher notification bugs (#30823)" (#30942)"
This reverts commit 6d2c4a8314.
* use GRPC_CUSTOM_JSONUTIL macro for JsonPrintOptions
This adds a unit test for XdsClient and fixes several watcher-notification bugs found in the process. Specifically:
- When an ADS stream fails or an xDS channel reports a connectivity failure, report an error only to the watchers for resources being subscribed to on that particular channel, not to watchers on other channels.
- Cache the error status for the channel, so that if a new watcher is started after the channel reports the error, we can immediately report that error to the new watcher.
- If a resource is NACKed and has not been previously cached, or does not exist, report that fact to any new watcher that may be started later.
- If a resource in an ADS response is unparseable but is wrapped in a `Resource` wrapper, we do know its name, so record the validation failure in the cache and report it to the watchers.
Co-authored-by: markdroth <markdroth@users.noreply.github.com>
* client_channel: rewrite illegal status codes from control plane
* rewrite illegal status codes for call creds
* move fail_lb policy out of retry_lb_fail test so it can be reused
* test resolver and LB policy status rewrites
* add test for ConfigSelector status rewriting
* attempt to add client_auth filter unit test
* fix client_auth_filter test
* cleanup test
* fix build
* fix some memory leaks
* Automated change: Fix sanity tests
* Update client_auth_filter_test.cc
* fix build
* code review comments
* clang-tidy
Co-authored-by: markdroth <markdroth@users.noreply.github.com>
Co-authored-by: Craig Tiller <ctiller@google.com>
* [cleanup] Remove profiling timers
- nobody has used this system in years
- if we needed it, we'd probably rewrite it at this point to be something more modern
- let's remove it until that need arises
* fix
* fixes
* Disable end2end_binder_transport_test on some platforms
The following test case is flaky on windows
End2EndBinderTransportTestWithDifferentDelayTimes/End2EndBinderTransportTest.UnaryCallServerTimeout/1,
where GetParam() = 10ns
Binder transport won't be run on platform other than Android so it
should be OK to disable the test on some platform.
* Regenerate projects.
* Reland: "Make GetDefaultEventEngine return a shared_ptr (#30280)"
This reverts commit 45959e7cc1.
* Attempted fix with NoDestruct
* Not a process-wide singleton for the type. Just a NonDestruct
* fix
This works around valgrind memory leaks by giving EventEngines a fixed
lifetime. We eventually want ref-counted EventEngines internally, so this is
a step in the right direction as well.
A (currently) pthread_atfork-based fork support mechanism, allowing EventEngines - or any other object that wants to implement the Forkable interface - respond to forks.