<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This is a big rewrite of global config.
It does a few things, all somewhat intertwined:
1. centralize the list of configuration we have to a yaml file that can
be parsed, and code generated from it
2. add an initialization and a reset stage so that config vars can be
centrally accessed very quickly without the need for caching them
3. makes the syntax more C++ like (less macros!)
4. (optionally) adds absl flags to the OSS build
This first round of changes is intended to keep the system where it is
without major changes. We pick up absl flags to match internal code and
remove one point of deviation - but importantly continue to read from
the environment variables. In doing so we don't force absl flags on our
customers - it's possible to configure grpc without the flags - but
instead allow users that do use absl flags to configure grpc using that
mechanism. Importantly this lets internal customers configure grpc the
same everywhere.
Future changes along this path will be two-fold:
1. Move documentation generation into the code generation step, so that
within the source of truth yaml file we can find all documentation and
data about a configuration knob - eliminating the chance of forgetting
to document something in all the right places.
2. Provide fuzzing over configurations. Currently most config variables
get stashed in static constants across the codebase. To fuzz over these
we'd need a way to reset those cached values between fuzzing rounds,
something that is terrifically difficult right now, but with these
changes should simply be a reset on `ConfigVars`.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Alongside https://github.com/grpc/grpc/pull/32496, this makes this test
behave the same on all platforms.
FWIW, I verified this causes us to see the previous lock cycle problem
in https://github.com/grpc/grpc/pull/32491 on linux - originally that
lock cycle was only on mac, because of environmental differences between
mac and linux in CI.
* Reland x2: Make GetDefaultEventEngine return a shared_ptr
* remove thread leak from NativeDNSResolver
This is not going to work for resolvers that support cancellation.
* give resolvers bounded lifetimes
Some resolver own EventEngines. EventEngines cannot run off the end of
the process since they have unjoined threads (problematic in a small set
of environments). This gives resolvers bounded lifetimes, and allows
replacement of resolvers without ASAN issues of deleting resolvers in
active use (occurs in tests).
* fix
* fix windows
* fix surface init test
* fix
* sanitize
* use after move
* the test must wait for the callback to be destroyed
* windows fix: delete the resolver on iomgr shutdown, not before
* Make TimerManager threads non-joinable
On gRPC shutdown, any unjoined TimerManager threads will cause TSAN to
detect thread leaks. This fix resolves issues I saw in end2end test
shutdown in another PR, where a single timer manager thread was always
alive after the test ended.
The long-term solution is to integrate the new ThreadPool here, but this
unblocks me for now.
* backport fix
* fix
* shared_ptr<EventEngine> in EventEngine benchmarks
We have many tests that create 100 threads or more, and mounting evidence that
this is harmful to our CI environment.
When the original code for many of these tests was written we ran our tests
under run_tests, which had explicit handling for tracking the number of threads
each test needed and making sure that we weren't over subscribing the test
runner. Bazel has no such facility (and the facility in run_tests has since
been removed) and so we need to adjust.
This PR adjusts down a single test and is part of a series so that we can
review and roll back easily if required.
* Reland: "Make GetDefaultEventEngine return a shared_ptr (#30280)"
This reverts commit 45959e7cc1.
* Attempted fix with NoDestruct
* Not a process-wide singleton for the type. Just a NonDestruct
* fix
This works around valgrind memory leaks by giving EventEngines a fixed
lifetime. We eventually want ref-counted EventEngines internally, so this is
a step in the right direction as well.
* Rename the default EventEngine headers
Small cleanup. This code hasn't been related to factories for a month or
two.
* ensure only one target contains default_event_engine.h
* src + hdr in same target
* include guards
* Revert "Revert "Reland: Add SRV and TXT record lookup methods to the iomgr PAI (#30242)"
This reverts commit b5966f39eb.
* release lock before unreffing
* Revert "Reland: Add SRV and TXT record lookup methods to the iomgr API (#30206)"
This reverts commit c229703f9f.
* Automated change: Fix sanity tests
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
* Revert "Revert "Add SRV and TXT record lookup methods to the iomgr API (#30078)" (#30176)"
This reverts commit 2c3acbb2b2.
* one way to fix the ares handle race. Another option: work_serializer
* replace mu with parent's work serializer
* add lock annotations
* Revert "replace mu with parent's work serializer"
This reverts commit 0fce0ae150.
* statusor -> optional
* Automated change: Fix sanity tests
* add missing dep
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
* Rename ResolveName to LookupHostname (same as EventEngine)
* Add stubs and no-op impls for LookupTXT and LookupSRV
* add native resolver tests that assert unimplemented
* extract custom name_server-setting logic and remove goto
* Separate SRV queries from grpc_dns_lookup_ares
* add necessary fixits before merging
* Automated change: Fix sanity tests
* fix missing ExecCtx on resolver tests
* separate out TXT lookup from hostname lookup (now all 3 are separate)
* rm DNS and update docs
* fix the fixer (forgot to add deps to BUILD)
* remove unused SRV and TXT args from ares hostname lookup method
* rename hostname-only ares dns lookup method
* refactor AresRequest using template method pattern
* Add name_server to Ares LookupHostname internals (needs iomgr API change)
* fix resolver test, callback should not be called on cancellation
* implement Ares-iomgr SRV and TXT lookup methods (verified manually)
Used a custom bind server with some redacted tests from
`resolve_address_test` to ensure both are working as expected.
* cleanup cruft
* unify common ares request setup logic between A, AAAA, TXT, and SRV
* generate_projects
* comment out unused args
* DNSResolver iomgr API uses Duration; hostname has all args now
* rm stale TODOs
* windows fix - bad variable name
* windows fix
* Automated change: Fix sanity tests
* reviewer feedback
* make protected members private
* move common properties to AresRequest base class
* localhost TXT results are empty, not an error
* reviewer feedback
* fix
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
DNS requests were previously cancellable, but it was assumed
that the resolution callback would be called in all cases. Now, requests
provide a `bool Cancel(TaskHandle handle)` method that lets callers know
whether cancellation is successful, and if so, the callback will not be
run.
This is in accordance with EventEngine's cancellation semantics, and it
is a step towards migration from iomgr to EventEngine.
* Refactor end2end tests to exercise each EventEngine
* fix incorrect bazel_only exclusions
* Automated change: Fix sanity tests
* microbenchmark fix
* sanitize, fix iOS flub
* Automated change: Fix sanity tests
* iOS fix
* reviewer feedback
* first pass at excluding EventEngine test expansion
Also caught a few cases where we should not test pollers, but should
test all engines. And two cases where we likely shouldn't be testing
either product.
* end2end fuzzers to be fuzzed differently via EventEngine.
* sanitize
* reviewer feedback
* remove misleading comment
* reviewer feedback: comments
* EE test_init needs to play with our build system
* fix golden file test runner
Co-authored-by: drfloob <drfloob@users.noreply.github.com>