Adds a test for the experiments codegen. It updates the codegen to parse
test_experiments.yaml and test_experiments_rollouts.yaml files and
generate test_experiments.h and test_experiments.cc files along with an
experiments_test.cc file. The experiments test verifies the returned
value of IsExperimentEnabled with the expected value.
This PR does the following: for the TLS server credentials, stops
calling `SSL_CTX_set_client_CA_list` by default in
`ssl_transport_security.cc`, and gives users a knob to re-enable calling
this API.
## What does the `SSL_CTX_set_client_CA_list` API do?
When this API is called, a gRPC TLS server sends the following data in
the ServerHello: for each certificate in the server's trust bundle, the
CA name in the certificate.
This API does not change the set of certificates trusted by the server
in any way. Rather, it is just providing a hint to the client about what
client certificate should be sent to the server.
## Why are we removing the use of `SSL_CTX_set_client_CA_list` by
default for the TLS server credentials?
Removing the use of this API by default has 2 benefits:
1. Calling this API makes gRPC TLS unusable for servers with a
sufficiently large trust bundle. Indeed, if the server trust bundle is
too large, then the server will always fail to build the ServerHello.
2. Calling this API is introducing a huge amount of overhead (1000s of
bytes) to each ServerHello, so removing this feature will improve
connection establishment latency for all users of the TLS server
credentials.
More work on the dualstack backend design:
- Now that all petiole policies have been changed to delegate to
pick_first, outlier detection no longer needs to eject via the
subchannel's raw connectivity state; it can now eject only via the
health state. See #33340.
- This also removes the now-unnecessary hack to explicitly disable
outlier detection in pick_first. See #33336.
More work on the dualstack backend design:
- Change ring_hash policy to delegate to pick_first instead of creating
subchannels directly.
- Note that, as mentioned in the WIP gRFC, because we lazily create the
pick_first child policies, so there's no need to swap over to a new list
as an atomic whole. As a result, we don't use the endpoint_list library
in this policy; instead, we just update a map in-place.
- Remove now-unused subchannel_list library.
More work on the dualstack backend design:
- Change round_robin to delegate to pick_first instead of creating
subchannels directly.
- Change pick_first such that when it is the child of a petiole policy,
it will unconditionally start a health watch.
- Change the client-side health checking code such that if client-side
health checking is not enabled, it will return the subchannel's raw
connectivity state.
- As part of this, we introduce a new endpoint_list library to be used
by petiole policies, which is intended to replace the existing
subchannel_list library. The only policy that will still directly
interact with subchannels is pick_first, so the relevant parts of the
subchannel_list functionality have been copied directly into that
policy. The subchannel_list library will be removed after all petiole
policies are updated to delegate to pick_first.
Add bazel dependency on opentelemetry-cpp.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
In FuzzingDNSResolver, capturing the engine as raw pointers in the
lambda functions instead of capturing the `this` pointer. By the time
the lambda is ran, the FuzzingDNSResolver might already be destroyed but
the engine should still be alive.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
I'm fixing the ALTS/Envoy transport socket extension (which is currently
broken). Along the way, I'm trying to remove as many uses of gRPC
internals as possible (with the eventual goal of only relying on public
gRPC APIs and the alts_zero_copy_grpc_protector). To this end, I need to
remove the ExecCtx check in the alts_zero_copy_grpc_protector_create
function, so that Envoy can call into this function without needing to
create an ExecCtx.
The address attribute interface was intended to provide a mechanism to
pass attributes separately from channel args, for values that do not
affect subchannel behavior and therefore do not need to be present in
the subchannel key, which does include channel args. However, the
mechanism as currently designed is fairly clunky and is probably not the
direction we will want to go in the long term.
Eventually, we will want some mechanism for registering channel args,
which would provide a cleaner way to indicate that a given channel arg
should not be used in the subchannel key, so that we don't need a
completely different mechanism. For now, this PR is just doing an
interim step, which is to establish a special channel arg key prefix to
indicate that an arg is not needed in the subchannel key.
This change simplifies `EventEngine::DNSResolver`'s API based on the
proposal:
[go/event-engine-dns-resolver-api-changes](http://go/event-engine-dns-resolver-api-changes).
Note that this API change + the implementation described in
[go/event-engine-dns-resolver-implementation](http://go/event-engine-dns-resolver-implementation)
has already been tested against our main test suites and are passing
them.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts commit e107ff5e99.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
In the HTTP(S) test server in the core tests, use
`ssl.SSLContext.wrap_socket`, not `ssl.wrap_socket`. The latter emits a
`DeprecationWarning` since Python 3.10 and is [removed in Python
3.12](https://github.com/python/cpython/issues/94199).
This fixes the core tests (but not necessarily the `grpcio` tests) for
Python 3.12.
This is relevant to https://github.com/grpc/grpc/issues/33063.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
In chttp2: a pending but not yet sent goaway should block incoming
requests just like a sent one (we will sent that data momentarily!)
In the test:
- handle the case of the connection idle timeout happening before the
request arrives at the server
- disable retries, as these cause the request to get stuck (as we don't
have an additional server to retry on)
Fix b/287897932
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Noticed some inconsistencies in our keepalive configuration -
* Earlier, even if keepalive pings were disabled, we would be scheduling
keepalive pings at an interval of INT_MAX ms.
* We were not using `g_default_client_keepalive_permit_without_calls` /
`g_default_server_keepalive_permit_without_calls`. They are both false
by default but they can be overridden in
`grpc_chttp2_config_default_keepalive_args`.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
We want writes to participate in event re-ordering, but it's unlikely
that we can sustain one byte per 500ms on all tests and keep them
passing (which is the degenerate case right now).
Tune write delays down to 50ms for the moment, though I expect we'll
want to talk about going lower.
omgwtfbbq
This test relies on WAIT_FOR_READY semantics, but we don't do that in
the proxy, so it got assigned the wrong suite.
Fix the suite, fix the flakes.
Also add some handy dandy logging to help figure this stuff out in the
future.
I can still make the old algorithm break and assign duplicate names on
my machine... make it a little more robust.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
I've had local runs with a 10 second gap between creating the call and
issuing the first batch client side.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>