Instead of fixing a target size for writes, try to adapt it a little to
observed bandwidth.
The initial algorithm aims for large writes that complete within a
100-1000ms maximum delay; this range will probably need tuning, but
let's see.
The hope is that on slow connections we buffer less data ahead of time,
so that when we need to send a ping-ack we can do so without great
delay.
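As a rough illustration of the idea (not the code in this change), the
sketch below grows the write target when a write finishes in under 100ms
and shrinks it when a write takes over 1000ms. The class name
`WriteSizePolicy`, the growth and shrink factors, and the size bounds are
all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>

namespace {
// The 100-1000ms window comes from the description; everything else here
// (bounds, starting size, scaling factors) is an assumed placeholder.
constexpr double kFastWriteMs = 100.0;
constexpr double kSlowWriteMs = 1000.0;
constexpr std::size_t kMinTarget = 32u * 1024u;
constexpr std::size_t kMaxTarget = 16u * 1024u * 1024u;
}  // namespace

// Hypothetical helper that adapts the target write size to how long
// recent writes took to complete.
class WriteSizePolicy {
 public:
  std::size_t target() const { return target_; }

  // Call once per completed write with the time the write took.
  void OnWriteFinished(double elapsed_ms) {
    if (elapsed_ms < kFastWriteMs) {
      // The connection drained the write quickly: allow larger writes.
      target_ = std::min(target_ * 3 / 2, kMaxTarget);
    } else if (elapsed_ms > kSlowWriteMs) {
      // The write sat in buffers too long: back off so that a ping-ack
      // queued behind it would not be delayed as much.
      target_ = std::max(target_ / 3, kMinTarget);
    }
    // Writes inside the window leave the target unchanged.
  }

 private:
  std::size_t target_ = 128u * 1024u;
};
```

Shrinking faster than growing is a design choice of this sketch: it
favours low ping-ack latency over throughput on connections that turn out
to be slow.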
Just seeing data flowing in after a ping is enough to establish liveness
of a connection, and so we can limit keepalive timeouts to that. Ping
timeouts are necessary for protocol correctness, but the pings themselves
may be stuck behind other traffic, so give them a little more of a grace
period.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Fix b/304114403
- adds a new experimental tracer useful for diagnosing ping timeout
failures in unit tests
- adds a pair of experimental tracers for fuzzing event engine
- fixes the behavior of FuzzingEventEngine so that a RunAfter(0, ...)
runs in the same tick
- increases the rate of sends (reduces the send delay) so that we are
guaranteed to be able to send 200kb/sec in fuzzed e2e unit tests
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
The `TickForDuration()` method was using `grpc_core::Timestamp::Now()`
to get the current time, but that was not in sync with the `now_` value
inside the Fuzzing EE itself. As a result, after two successive 250ms
increments, timers were not being fired properly. I've added a test that
demonstrates this failure without the fix.
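To make the failure mode concrete, here is a minimal sketch of a fake
clock whose tick is computed from its own `now_` rather than from an
external time source. `FakeClock` and its methods are hypothetical
stand-ins for illustration, not the FuzzingEventEngine API.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical, heavily simplified stand-in for a fuzzing event engine.
class FakeClock {
 public:
  using Millis = std::int64_t;

  Millis now() const { return now_; }

  // Schedule fn to run once `delay` ms have elapsed on the fake clock.
  void RunAfter(Millis delay, std::function<void()> fn) {
    timers_.emplace(now_ + delay, std::move(fn));
  }

  // Advance the fake clock by `duration`, firing due timers in deadline
  // order. The deadline is derived from now_, never from the wall clock,
  // so successive ticks accumulate exactly.
  void TickForDuration(Millis duration) {
    const Millis deadline = now_ + duration;
    while (!timers_.empty() && timers_.begin()->first <= deadline) {
      auto it = timers_.begin();
      now_ = std::max(now_, it->first);
      std::function<void()> fn = std::move(it->second);
      timers_.erase(it);
      fn();  // May schedule more timers; they are picked up by the loop.
    }
    now_ = deadline;
  }

 private:
  Millis now_ = 0;
  std::multimap<Millis, std::function<void()>> timers_;
};
```

With this structure, a timer set for 400ms stays pending after one 250ms
tick and fires during the second, which is the scenario the new test
exercises.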
Experiment 1: On RST_STREAM: reduce MAX_CONCURRENT_STREAMS for one round
trip.
Experiment 2: If a settings frame is outstanding with a lower
MAX_CONCURRENT_STREAMS than is configured, and we receive a new incoming
stream that would exceed the new cap, randomly reject it.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Previously chttp2 would allow an unlimited number of requests prior to a
settings ack, since the agreed-upon limit for requests in that state is
infinite. Instead, after MAX_CONCURRENT_STREAMS requests have been
attempted, start blanket cancelling requests until the settings ack is
received. This can be done efficiently without allocating request state
structures.
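A minimal sketch of that admission rule, assuming a hypothetical
`PreAckStreamLimiter` class (the real transport code is structured
differently): it only keeps a counter and a flag, so rejecting a request
needs no per-request state.

```cpp
#include <cstdint>

// Hypothetical admission check applied to incoming requests before the
// peer has acknowledged our SETTINGS frame.
class PreAckStreamLimiter {
 public:
  explicit PreAckStreamLimiter(std::uint32_t max_concurrent_streams)
      : max_concurrent_streams_(max_concurrent_streams) {}

  // Call when the settings ack arrives; from then on the normal
  // MAX_CONCURRENT_STREAMS enforcement (not modeled here) applies.
  void OnSettingsAck() { settings_acked_ = true; }

  // Returns true if the request should be processed, false if it should
  // be cancelled immediately without allocating request state.
  bool AdmitRequest() {
    if (settings_acked_) return true;
    if (attempted_ >= max_concurrent_streams_) return false;
    ++attempted_;
    return true;
  }

 private:
  const std::uint32_t max_concurrent_streams_;
  std::uint32_t attempted_ = 0;
  bool settings_acked_ = false;
};
```

Note that the counter tracks attempted requests rather than currently
open ones, matching the wording above.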
This is the initial change for the chaotic-good client transport read
path, following the client transport write path PR at #33876.
Handling of endpoint failures in the transport is still pending. It will
be added once we have the inter-activity pipe with a close function.
- Fixes support for the same address being present more than once in the
address list, which was accidentally broken in #34244.
- Change the call attribute to encode the hash as an integer instead of
a string.
This behavior is dangerous because we will crash when the cache is
created, which is not necessarily on application startup and is likely
when you first try to establish an SSL connection. Instead, we log an
error. If the SSL library attempts to put a session ticket in the cache
it will fail to do so, but everything else will continue as normal. In
particular, we will always seamlessly fall back to a full SSL handshake.
Along the way, we also ensure that you cannot put a null `SSL_SESSION`
into the cache, which would lead to a segfault when it is fetched from
the cache.
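The null check can be pictured with the sketch below. It uses a
hypothetical `FakeSession` struct in place of `SSL_SESSION` and a
hypothetical `SessionCache` class, so it shows only the shape of the
guard, not the actual cache implementation.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for SSL_SESSION so the example has no TLS library dependency.
struct FakeSession {
  int id;
};

// Hypothetical cache that refuses null sessions at insertion time, so a
// later fetch can never hand back a null entry that was stored by
// mistake.
class SessionCache {
 public:
  // Returns false and stores nothing if session is null.
  bool Put(const std::string& key, std::unique_ptr<FakeSession> session) {
    if (session == nullptr) return false;
    entries_[key] = std::move(session);
    return true;
  }

  // Returns nullptr on a miss; the caller is then expected to fall back
  // to a full handshake.
  const FakeSession* Get(const std::string& key) const {
    auto it = entries_.find(key);
    return it == entries_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<std::string, std::unique_ptr<FakeSession>> entries_;
};
```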
The previous hack had bitrotted and was just returning 3.7. This method
was given to us by the GCF team and has backward compatibility
guarantees.
This will also help us to ensure that we don't accidentally remove
support for a particular Python runtime version before GCF does.
It turns out that it was previously not safe to run grpc_init in a
filter test: we'd end up mixing event engine implementations, causing
undefined behavior at grpc_shutdown.
This change makes it safe and fixes an internal test that is currently
flaking at 70% (b/302986486).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Fix chttp2 too_many_pings test to use only one of IPv4 or IPv6,
depending on test environment.
Also fix dumb reversed conditional bug in some other tests that was
accidentally introduced in #34426.
Looks like we've got a thread race on shutdown with some of these
tests... adding a barrier at the head of tests that require precise
transport counts in order to stabilize.
Isolate ping callback tracking to its own file.
Also takes the opportunity to simplify keepalive code by applying the
ping timeout to all pings.
Adds an experiment to allow multiple pings outstanding too (this was
originally an accidental behavior change of the work, but one that I
think may be useful going forward).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>