More changes as part of the dualstack design:
- Change resolver and LB policy APIs to support multiple addresses per
endpoint. Specifically, replace `ServerAddress` with
`EndpointAddresses`, which encodes more than one address. Per-address
channel args are retained at the same level, so they are now
per-endpoint. For now, `EndpointAddresses` provides a single-address ctor
and a single-address accessor for backward compatibility, and
`ServerAddress` is an alias for `EndpointAddresses`; eventually, this
alias and the single-address methods will be removed.
- Add an `EndpointAddressSet` class, which represents an unordered set
of addresses to be used as a map key. This will be used in a number of
LB policies that need to store per-endpoint state.
- Change the LB policy API's `ChannelControlHelper::CreateSubchannel()`
method to take the address and per-endpoint channel args as separate
parameters, so that we don't need to construct a legacy `ServerAddress`
object as we create a new subchannel for each address in the endpoint.
- Change pick_first to flatten the address list.
- Change ring_hash to use `EndpointAddressSet` as the key for its
endpoint map, and to use the first address of the endpoint as the hash
key.
- Change WRR to use `EndpointAddressSet` as the key for its endpoint
weight map.
Note that support for multiple addresses per endpoint is guarded in RR
by the existing `round_robin_delegate_to_pick_first` experiment and in
WRR by the existing `wrr_delegate_to_pick_first` experiment.
This PR does *not* include support for multiple addresses per endpoint
for the outlier_detection or xds_override_host LB policies; those will
come in subsequent PRs.
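
As a rough illustration of the idea behind an unordered address set used as a map key, and of flattening an endpoint list into a single address list, here is a minimal sketch. It is not the gRPC implementation: `Endpoint`, `AddressSetKey`, `MakeKey`, `EndpointWeightMap`, and `FlattenAddresses` are hypothetical names, and addresses are modeled as plain strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an endpoint: an ordered list of addresses.
struct Endpoint {
  std::vector<std::string> addresses;
};

// Hypothetical order-insensitive key. Two endpoints holding the same
// addresses in a different order produce the same key.
using AddressSetKey = std::set<std::string>;

AddressSetKey MakeKey(const Endpoint& endpoint) {
  return AddressSetKey(endpoint.addresses.begin(), endpoint.addresses.end());
}

// Per-endpoint state keyed by the address set, e.g. a weight.
using EndpointWeightMap = std::map<AddressSetKey, float>;

// Flattens a list of endpoints into one address list, preserving order.
std::vector<std::string> FlattenAddresses(
    const std::vector<Endpoint>& endpoints) {
  std::vector<std::string> flat;
  for (const Endpoint& endpoint : endpoints) {
    flat.insert(flat.end(), endpoint.addresses.begin(),
                endpoint.addresses.end());
  }
  return flat;
}
```

Keying on the set rather than the ordered list means a resolver update that merely reorders an endpoint's addresses still finds the existing per-endpoint state.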
Expand our fuzzing capabilities by allowing fuzzers to choose the bits
that go into random number distribution generators.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Lets us sever the dependency between stats & exec ctx (finally).
More work likely needs to go into the *mechanism* used here (I'm not a
fan of the per thread index), but that's also something we can address
later.
If the client calls LookupHostname again from within the on_resolve
callback, it tries to re-acquire `request_mu_` while still holding it,
which results in deadlock.
This PR extracts the request and releases the lock before calling the
on_resolve callback, so it no longer deadlocks.
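
The pattern described above (take the request out of the shared map under the lock, release the lock, then invoke the callback) can be sketched as follows. This is a simplified illustration, not the actual resolver code: `FakeResolver`, `Complete`, and `PendingCount` are hypothetical, and only `LookupHostname`, `on_resolve`, and `request_mu_` are names taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical resolver used only to illustrate the locking pattern.
class FakeResolver {
 public:
  using Callback = std::function<void(const std::string& result)>;

  // Registers a pending lookup and returns its id.
  int LookupHostname(std::string name, Callback on_resolve) {
    std::lock_guard<std::mutex> lock(request_mu_);
    const int id = next_id_++;
    requests_.emplace(id, Request{std::move(name), std::move(on_resolve)});
    return id;
  }

  // Completes a pending lookup. The request is moved out of the map while
  // the lock is held; the callback runs only after the lock is released,
  // so the callback may safely call LookupHostname() again.
  bool Complete(int id, const std::string& result) {
    Request request;
    {
      std::lock_guard<std::mutex> lock(request_mu_);
      auto it = requests_.find(id);
      if (it == requests_.end()) return false;
      request = std::move(it->second);
      requests_.erase(it);
    }  // request_mu_ is released here.
    request.on_resolve(result);
    return true;
  }

  std::size_t PendingCount() {
    std::lock_guard<std::mutex> lock(request_mu_);
    return requests_.size();
  }

 private:
  struct Request {
    std::string name;
    Callback on_resolve;
  };

  std::mutex request_mu_;
  int next_id_ = 0;
  std::map<int, Request> requests_;
};
```

Had `Complete()` invoked the callback inside the locked scope, a nested `LookupHostname()` call would try to lock the non-recursive `request_mu_` a second time on the same thread.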
Revert the reversion of the SSL_CTX_new change (#34355 reverted #34180)
with a fix.
There was an issue with using `strcpy` into a string allocated with
`new[]` in the constructor of `ssl_credentials`. An ASAN test caught
this in some CI down the line - `ERROR: AddressSanitizer: alloc-dealloc-mismatch
(operator new [] vs free)`
That `strcpy` call was changed to `gpr_strdup`, which duplicates a string
in a way that can be freed by `gpr_free` and should resolve the ASAN
failure.
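
To make the allocator mismatch concrete, here is a small sketch using only the C standard library. `DupForFree` is a hypothetical stand-in for a strdup-style helper; it assumes, as the description says of `gpr_strdup` and `gpr_free`, that the duplicate and the release go through the same allocator family (here `malloc` and `free`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical strdup-style helper: the copy is allocated with malloc(),
// so releasing it with free() is a matched pair.
char* DupForFree(const char* src) {
  if (src == nullptr) return nullptr;
  const std::size_t len = std::strlen(src) + 1;  // include the terminator
  char* dst = static_cast<char*>(std::malloc(len));
  if (dst != nullptr) std::memcpy(dst, src, len);
  return dst;
}

// The mismatched pattern that ASAN flags looks roughly like this
// (left as a comment on purpose, since it is undefined behavior):
//
//   char* p = new char[len];  // allocated with operator new[]
//   strcpy(p, src);
//   free(p);                  // released with free(): mismatch
```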
We have a bunch of experiments testing against core e2e - and this is
good for robustness, bad for CI times.
We also have a bunch of marginal but overall necessary fixtures in the
e2e suites - again good for robustness, bad for CI times.
We can eliminate some of the cross product though, and I think safely:
run experiments on a broad range of suites, but not *ALL* the suites,
and get a bunch of our CI time back.
Here I introduce an environment variable: `GRPC_CI_EXPERIMENTS` that's
set when running bazel @experiment= configs, cleared otherwise (so we
can still execute those tests directly when necessary). When that env
var is set we filter out a bunch of suites from the test configurations.
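
A minimal sketch of this kind of environment-driven filtering is shown below. Only the variable name `GRPC_CI_EXPERIMENTS` comes from the description; `CiExperimentsEnabled`, `SelectSuites`, and the rule that any non-empty value counts as "set" are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>
#include <string>
#include <vector>

// Assumption: the variable counts as "set" when present and non-empty.
bool CiExperimentsEnabled() {
  const char* value = std::getenv("GRPC_CI_EXPERIMENTS");
  return value != nullptr && value[0] != '\0';
}

// Returns the suites to run. When experiments are enabled, suites listed
// in `skipped_under_experiments` are dropped; otherwise all suites run.
std::vector<std::string> SelectSuites(
    const std::vector<std::string>& all_suites,
    const std::set<std::string>& skipped_under_experiments,
    bool experiments_enabled) {
  std::vector<std::string> selected;
  for (const std::string& suite : all_suites) {
    if (experiments_enabled && skipped_under_experiments.count(suite) > 0) {
      continue;
    }
    selected.push_back(suite);
  }
  return selected;
}
```

A caller would pass `CiExperimentsEnabled()` as the last argument; taking it as a parameter keeps the filter itself independent of the process environment.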
Add some basic metrics to work serializer, keep them process wide for
now (though it may be interesting to get these into channelz in the
future).
Collected are:
- time spent running a work serializer when it starts
- time spent actually executing work when the work serializer runs
- number of items executed each run
A high disparity between the first two indicates our dispatching
mechanism is adding large amounts of latency (perhaps due to
thread-starvation-like effects).
A high value for any of these indicates contention on the serializer.
It's likely a future iteration on these will select different metrics -
I'm not *entirely* sure which will be useful in production analysis yet.
I'm using `std::chrono::steady_clock` here for precision (nanoseconds)
with a compact representation (better than timespec) and a robust &
portable api - I think it's appropriate for metrics, but wouldn't use it
much beyond that at this point.
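
For illustration, the three measurements could be gathered along these lines with `std::chrono::steady_clock`. This is a simplified single-threaded sketch, not the work serializer itself: `RunMetrics` and `DrainQueue` are hypothetical, and reading the first metric as the wall time of a whole run is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical per-run measurements.
struct RunMetrics {
  std::chrono::steady_clock::duration time_running;    // whole run
  std::chrono::steady_clock::duration time_executing;  // inside callbacks
  std::size_t items_executed;
};

// Runs every queued item once, recording how long the run took overall,
// how much of that was spent inside callbacks, and how many items ran.
RunMetrics DrainQueue(std::vector<std::function<void()>>& queue) {
  using Clock = std::chrono::steady_clock;
  RunMetrics metrics{};  // durations start at zero, count at 0
  const Clock::time_point run_start = Clock::now();
  for (std::function<void()>& item : queue) {
    const Clock::time_point work_start = Clock::now();
    item();
    metrics.time_executing += Clock::now() - work_start;
    ++metrics.items_executed;
  }
  queue.clear();
  metrics.time_running = Clock::now() - run_start;
  return metrics;
}
```

Because `steady_clock` is monotonic, the time spent inside callbacks can never exceed the time of the enclosing run, so the gap between the two is a direct measure of dispatch overhead.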
This has been stable for a bit, everywhere that the EventEngine is
enabled. Going forward, I think the event_engine_{client|listener}
experiments can probably be used to regulate thread-pool-specific
issues.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Move the SSL_CTX to the level of the credentials rather than the
subchannel.
The SSL_CTX should only get created once per credential rather than once
per subchannel.
We should observe no behavior change with this PR, only efficiency
gains.
Most recent attempt was #34320, reverted in #34335.
The first commit here is a pure revert. The second commit fixes the
outlier_detection unit test to pass both with and without the
experiment.
To fix the following build error with the head of abseil:
```
/var/local/git/grpc/test/core/tsi/ssl_transport_security_utils_test.cc:231:42: error: no member named 'StrCat' in namespace 'absl'
return absl::InternalError(absl::StrCat("Client error:", client_err));
~~~~~~^
/var/local/git/grpc/test/core/tsi/ssl_transport_security_utils_test.cc:238:42: error: no member named 'StrCat' in namespace 'absl'
return absl::InternalError(absl::StrCat("Server error:", server_err));
~~~~~~^
```
The previous approach of generating strings was not converging well.
Instead, load a bitfield from the protobuf and use the bits to select
experiments. The fuzzers can explore this space swiftly.
Downside is that as experiments rotate in/out the corpus gets a bit
messed up, but I'm reasonably confident we'll recover quickly.
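
The bit-selection idea can be sketched as below. `SelectExperiments` and the experiment names in the usage are hypothetical; the sketch simply assumes bit `i` of the field enables the `i`-th known experiment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the experiments whose bit is set in `bits`. Bit i corresponds to
// known_experiments[i]; entries beyond the 64th are never selected.
std::vector<std::string> SelectExperiments(
    std::uint64_t bits, const std::vector<std::string>& known_experiments) {
  std::vector<std::string> enabled;
  for (std::size_t i = 0; i < known_experiments.size() && i < 64; ++i) {
    if ((bits >> i) & 1u) enabled.push_back(known_experiments[i]);
  }
  return enabled;
}
```

Under this mapping, adding or removing an experiment shifts which bit maps to which name, which is one way to picture why an existing corpus loses some of its meaning when experiments rotate.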
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Splitting off from https://github.com/grpc/grpc/pull/34273
In certain situations the current flow control algorithm can result in
sending one flow control update write for every write sent (known
situation: rollout of promise based server calls with qps_test).
Fix things up so that the updates are only sent when truly needed, and
then fix the fallout (turns out our fuzzer had some bugs).
I've placed actual logic changes behind an experiment so that it can be
incrementally & safely rolled out.
This should address one of the failures we're seeing in #34224.
The test failure is caused by the changes in timing triggering a race
condition. In the code at head, we delay sending out the subscription
for the first CDS watch until we've already seen the other two CDS
watches, because the previous send_message op has not yet completed, and
by the time it does, we've seen all 3 watches, so we can send a
subscription for all 3 at the same time. With the WorkSerializer change,
the send_message op is complete by the time we see the first CDS watch,
so we subscribe to only that resource, and then later add the other two.
The result is that we'll NACK twice with two different messages, the
first one including only the error about the first resource, and the
second one including all three.
I suspect this same race condition would have been triggered eventually
by the EventEngine migration anyway; the current test basically depends
on the single-thread timing of the iomgr approach. So I'm addressing it
by replacing the e2e test with a unit test that covers the same cases
without the timing issue.
Rolls forward part of the dualstack changes, mostly from #33427 and a
little bit from #32692, both of which were reverted in #33718.
Specifically:
- For petiole policies, unconditionally start health watch on
subchannels, even if client side health checking is not enabled; in this
case, the health watch will report the subchannel's raw connectivity
state.
- Fix edge cases in health check reporting that occur when a watcher is
started before the initial state is reported.
- When client-side health checking fails, add the subchannel's address
to the RPC failure status message.
- Outlier detection now works only via the health checking watch, not
via the raw connectivity state watch.
- Remove now-unnecessary hack to ensure that outlier detection does not
work for pick_first.
This rolls forward only the pick_first changes from #32692, which were
rolled back in #33718. Specifically:
- Changes PF to use its own subchannel list implementation instead of
using the subchannel_list library, since the latter will be going away
with the dualstack changes.
- As a result of no longer using the subchannel_list library, PF no
longer needs to set the `GRPC_ARG_INHIBIT_HEALTH_CHECKING` channel arg.
- Adds an option to start a health watch on the chosen subchannel, to be
used in the future when pick_first is the child of a petiole policy.
(Currently, this code is not actually called anywhere.)
---------
Co-authored-by: Mark D. Roth <roth@google.com>
Co-authored-by: markdroth <markdroth@users.noreply.github.com>
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Pipe-like type (has a send end, a receive end, and a closing mechanism)
for cross-activity transfers.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
If we get a readable event on an fd and both of the following happen:
- c-ares does *not* read all bytes off the fd
- c-ares removes the fd from the set ARES_GETSOCK_READABLE
... then we have a busy loop here, where we'd keep asking c-ares to
process an fd that it no longer cares about.
This is indirectly related to a change in this code one month ago:
https://github.com/grpc/grpc/pull/33942 - before that change, c-ares
would close the socket when it called
[handle_error](7f3262312f/src/lib/ares_process.c (L707))
and so `IsFdStillReadableLocked` would start returning `false`, causing
us to get away with [this
loop](f6a994229e/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc (L371)).
Now, because `IsFdStillReadableLocked` will keep returning true (because
of our overridden `close` API), we'll loop forever.