Previously, metadata mutations were made by the picker directly, which meant that they would be applied even if the channel winds up discarding the pick due to the returned subchannel having been disconnected by the time the pick result is returned. This changes the API such that pickers return metadata mutations along with the pick result, so that the mutations won't get applied unless the pick result is actually used.
Closes#36968
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36968 from markdroth:lb_metadata_api 2765da6121
PiperOrigin-RevId: 645451869
[Gpr_To_Absl_Logging] Move function to test header form log.h
This is not really needed in log.h
Closes#36860
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36860 from tanvi-jagtap:move_function_to_test_header e6494bd06f
PiperOrigin-RevId: 642080756
[grpc][Gpr_To_Absl_Logging] Migrating from gpr to absl logging - BUILD
In this CL we are just editing the build and bzl files to add dependencies.
This is done to prevent merge conflict and constantly having to re-make the make files using generate_projects.sh for each set of changes.
Closes#36606
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36606 from tanvi-jagtap:build_test_core_tsi_and_misc 708a724c46
PiperOrigin-RevId: 633518709
[grpc][Gpr_To_Absl_Logging] Migrating from gpr to absl logging GPR_ASSERT
Replacing GPR_ASSERT with absl CHECK
These changes have been made using string replacement and regex.
Will not be replacing all instances of CHECK with CHECK_EQ , CHECK_NE etc because there are too many callsites. Only ones which are doable using very simple regex with least chance of failure will be replaced.
Given that we have 5000+ instances of GPR_ASSERT to edit, Doing it manually is too much work for both the author and reviewer.
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#36437
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36437 from tanvi-jagtap:tjagtap_core_util_01 719e3e20e1
PiperOrigin-RevId: 628043154
This breaks the following pieces out of the `grpc_client_channel` BUILD target:
- backend_metric_parser
- oob_backend_metric
- child_policy_handler
- backup_poller
- service_config_channel_arg_filter
- client_channel_channelz
- client_channel_internal_header
- subchannel_connector
- subchannel_pool_interface
- config_selector
- client_channel_service_config_parser
- retry_service_config_parser
- retry_throttle
The code left in the `grpc_client_channel` target will need more work to pull apart.
Closes#35879
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35879 from markdroth:client_channel_build_split f388a37edc
PiperOrigin-RevId: 608806548
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35210
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35210 from yijiem:csm-service-label 6a6a7d1774
PiperOrigin-RevId: 597641393
More changes as part of the dualstack design:
- Change resolver and LB policy APIs to support multiple addresses per
endpoint. Specifically, replace `ServerAddress` with
`EndpointAddresses`, which encodes more than one address. Per-address
channel args are retained at the same level, so they are now
per-endpoint. For now, `EndpointAddress` provides a single-address ctor
and a single-address accessor for backward compatibility, so
`ServerAdress` is an alias for `EndpointAddresses`; eventually, this
alias and the single-address methods will be removed.
- Add an `EndpointAddressSet` class, which represents an unordered set
of addresses to be used as a map key. This will be used in a number of
LB policies that need to store per-endpoint state.
- Change the LB policy API's `ChannelControlHelper::CreateSubchannel()`
method to take the address and per-endpoint channel args as separate
parameters, so that we don't need to construct a legacy `ServerAddress`
object as we create a new subchannel for each address in the endpoint.
- Change pick_first to flatten the address list.
- Change ring_hash to use `EndpointAddressSet` as the key for its
endpoint map, and to use the first address of the endpoint as the hash
key.
- Change WRR to use `EndpointAddressSet` as the key for its endpoint
weight map.
Note that support for multiple addresses per endpoint is guarded in RR
by the existing `round_robin_delegate_to_pick_fist` experiment and in
WRR by the existing `wrr_delegate_to_pick_first` experiment.
This PR does *not* include support for multiple addresses per endpoint
for the outlier_detection or xds_override_host LB policies; those will
come in subsequent PRs.
Expand our fuzzing capabilities by allowing fuzzers to choose the bits
that go into random number distribution generators.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Rolls forward https://github.com/grpc/grpc/pull/33871
Second and third commits here fix internal build issues
In particular, add a `// IWYU pragma: no_include <ares_build.h>` since
`ares.h` [includes that
anyways](bad62225b7/include/ares.h (L23))
(and seems unlikely for that to change since it would be breaking)
Normally, c-ares related fds are destroyed after all DNS resolution is
finished in [this code
path](c82d31677a/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc (L210)).
Also there are some fds that c-ares may fail to open or write to
initially, and c-ares will close them internally before grpc ever knows
about them.
But if:
1) c-ares opens a socket and successfully writes a request on it
2) then a subsequent read fails
Then c-ares will close the fd in [this code
path](bad62225b7/src/lib/ares_process.c (L740)),
but gRPC will have a reference on the fd and will still use it
afterwards.
Fix here is to leverage the c-ares socket-override API to properly track
fd ownership between c-ares and grpc.
Related: internal issue b/292203138
I've got a hypothesis that we're losing isolation between test shards
right now for "some reason".
This is a change to reflect test sharding in the port distribution that
we use, in an attempt to alleviate that.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
I generated a new client key and cert where a Spiffe ID is added as the
URI SAN. As such, we are able to test the audit log contains the
principal correctly.
Update: I switched to use the test logger to verify the log content and
removed stdout logger here because one the failure of [RBE Windows Debug
C/C++](https://source.cloud.google.com/results/invocations/c3187f41-bb1f-44b3-b2b1-23f38e47386d).
Update again: Refactored the test logger in a util such that the authz
engine test also uses the same logger. Subsequently, xDS e2e test will
also use it.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
ChannelArgs fuzz configuration is expected to be used in other fuzzing
targets as well. This PR extracts the common code from the API fuzzer
and converts to use the C++ types.
Add a new binary that runs all core end2end tests in fuzzing mode.
In this mode FuzzingEventEngine is substituted for the default event
engine. This means that time is simulated, as is IO. The FEE gets
control of callback delays also.
In our tests the `Step()` function becomes, instead of a single call to
`completion_queue_next`, a series of calls to that function and
`FuzzingEventEngine::Tick`, driving forward the event loop until
progress can be made.
PR guide:
---
**New binaries**
`core_end2end_test_fuzzer` - the new fuzzer itself
`seed_end2end_corpus` - a tool that produces an interesting seed corpus
**Config changes for safe fuzzing**
The implementation tries to use the config fuzzing work we've previously
deployed in api_fuzzer to fuzz across experiments. Since some
experiments are far too experimental to be safe in such fuzzing (and
this will always be the case):
- a new flag is added to experiments to opt-out of this fuzzing
- a new hook is added to the config system to allow variables to
re-write their inputs before setting them during the fuzz
**Event manager/IO changes**
Changes are made to the event engine shims so that tcp_server_posix can
run with a non-FD carrying EventEngine. These are in my mind a bit
clunky, but they work and they're in code that we expect to delete in
the medium term, so I think overall the approach is good.
**Changes to time**
A small tweak is made to fix a bug initializing time for fuzzers in
time.cc - we were previously failing to initialize
`g_process_epoch_cycles`
**Changes to `Crash`**
A version that prints to stdio is added so that we can reliably print a
crash from the fuzzer.
**Changes to CqVerifier**
Hooks are added to allow the top level loop to hook the verification
functions with a function that steps time between CQ polls.
**Changes to end2end fixtures**
State machinery moves from the fixture to the test infra, to keep the
customizations for fuzzing or not in one place. This means that fixtures
are now just client/server factories, which is overall nice.
It did necessitate moving some bespoke machinery into
h2_ssl_cert_test.cc - this file is beginning to be problematic in
borrowing parts but not all of the e2e test machinery. Some future PR
needs to solve this.
A cq arg is added to the Make functions since the cq is now owned by the
test and not the fixture.
**Changes to test registration**
`TEST_P` is replaced by `CORE_END2END_TEST` and our own test registry is
used as a first depot for test information.
The gtest version of these tests: queries that registry to manually
register tests with gtest. This ultimately changes the name of our tests
again (I think for the last time) - the new names are shorter and more
readable, so I don't count this as a regression.
The fuzzer version of these tests: constructs a database of fuzzable
tests that it can consult to look up a particular suite/test/config
combination specified by the fuzzer to fuzz against. This gives us a
single fuzzer that can test all 3k-ish fuzzing ready tests and cross
polinate configuration between them.
**Changes to test config**
The zero size registry stuff was causing some problems with the event
engine feature macros, so instead I've removed those and used GTEST_SKIP
in the problematic tests. I think that's the approach we move towards in
the future.
**Which tests are included**
Configs that are compatible - those that do not do fd manipulation
directly (these are incompatible with FuzzingEventEngine), and those
that do not join threads on their shutdown path (as these are
incompatible with our cq wait methodology). Each we can talk about in
the future - fd manipulation would be a significant expansion of
FuzzingEventEngine, and is probably not worth it, however many uses of
background threads now should probably evolve to be EventEngine::Run
calls in the future, and then would be trivially enabled in the fuzzers.
Some tests currently fail in the fuzzing environment, a
`SKIP_IF_FUZZING` macro is used for these few to disable them if in the
fuzzing environment. We'll burn these down in the future.
**Changes to fuzzing_event_engine**
Changes are made to time: an exponential sweep forward is used now -
this catches small time precision things early, but makes decade long
timers (we have them) able to be used right now. In the future we'll
just skip time forward to the next scheduled timer, but that approach
doesn't yet work due to legacy timer system interactions.
Changes to port assignment: we ensure that ports are legal numbers
before assigning them via `grpc_pick_port_or_die`.
A race condition between time checking and io is fixed.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Add the capability for api-fuzzer to fuzz over different config
variables, to enable us to spot incompatible configurations there
sooner.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Implement listeners, connection, endpoints for `FuzzingEventEngine`.
Allows the fuzzer to select write sizes and delays, connection delays,
and port assignments.
I made a few modifications to the test suite to admit this event engine
to pass the client & server tests:
1. the test factories return shared_ptr<> to admit us to return the same
event engine for both the oracle and the implementation - necessary
because FuzzingEventEngine forms a closed world of addresses & ports.
2. removed the WaitForSingleOwner calls - these seem unnecessary, and we
don't ask our users to do this - tested existing linux tests 1000x
across debug, asan, tsan with this change
Additionally, the event engine overrides the global port picker logic so
that port assignments are made by the fuzzer too.
This PR is a step along a longer journey, and has some outstanding
brethren PR's, and some follow-up work:
* #32603 will convert all the core e2e tests into a more malleable form
* we'll then use #32667 to turn all of these into fuzzers
* finally we'll integrate this into that work and turn all core e2e
tests into fuzzers over timer & callback reorderings and io
size/spacings
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
* rename task-handle mutex
* rename TimerClosure
* tmp commit, building, not tested
* add test for client connection to a non-listening port
* fix posix EE tests
* re-fix windows test suite after posix compatibility
* (unfinished, backup): passing the suite's NonExistingListener client test
* remove redundant windows client test
* integrate IOCP worker loop
* initial commit of echo test tool; fixes
* move echo client to test_suite/tools; I do not yet like the setup, it's about time for a macro that generates all useful test/tool targets
* cleanup
* add --target arg to echo_client. requires URI
* Automated change: Fix sanity tests
* build fixes
* fix
* fix
* reviewer feedback
* warning fix
* delete logic on cancellation
* fix
* cancel connect deadlock; improved template code
* fix build failure with linux environments building windows targets
* fix
* fix
* no ++ for atomics
* remove the test changes, to be landed separately
* Automated change: Fix sanity tests
* remnants
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
* XdsBootstrap: move two more methods out of the interface
* Automated change: Fix sanity tests
* XdsClient: add unit test
* Automated change: Fix sanity tests
* fix memory leaks
* add helper method
* add unsubscription
* add test for multiple subscriptions
* clang-format
* fix build
* fix flakiness
* add checking for other node fields
* add v2 test
* add response builder
* add test for update from server
* add test for update containing only changed resources
* clang-format
* fix build
* add test for resource not existing upon subscription
* add test for stream closed by server
* add test for multiple watchers for the same resource
* add test for connection failure
* clang-format
* add test for resources wrapped in Resource wrapper message
* add test for resource validation failure
* add test for multiple invalid resources, and fix a case in XdsClient
* add test for validation failure for already-cached resource
* add test for server not resending resources after stream disconnect
* clang-format
* fix XdsClient to report channel errors to newly started watchers
* fix XdsClient to send cached errors/does-not-exists to newly started watchers
* fix watcher to ensure events arrive in the expected order
* fix tests
* clang-format
* add test for multiple resource types
* fix xds_cluster_e2e_test
* Automated change: Fix sanity tests
* cleanup
* add federation tests
* clang-format
* remove now-unnecessary XdsCertificateProviderPluginMapInterface
* code review comments
* simplify XdsResourceType::Decode() API
* XdsClient: add unit tests for XdsClusterResourceType
* add XdsClient with gRPC bootstrap config
* add LB policy tests
* started adding CertificateProvider tests
* update for recent API changes
* fix merge bugs
* xDS resource validation: identify extensions by type_url instead of name
* fix build
* migrate to ValidationErrors
* add xds_common_types_test
* finish TLS tests and add LRS tests
* move ScopedExperimentalEnvVar to its own library and remove redundant e2e tests
* add circuit breaking and outlier detection tests
* add validation to outlier detection LB policy parsing
* clang-format
* Automated change: Fix sanity tests
* fix signedness
* fix sanity
* fix sanity
* iwyu
* update code for XdsResourceTypeImpl changes
Co-authored-by: markdroth <markdroth@users.noreply.github.com>
* A modest split of `:gpr` for mpscq support
A full split of the `gpr` target into 25 separate targets is building,
but there are some hurdles to get over with respect to ODR violations
and public API support for the gpr library.
This PR splits off a small chunk of that work.
* Automated change: Fix sanity tests
Co-authored-by: drfloob <drfloob@users.noreply.github.com>