`ProtoBitGen` provides a random number generator that returns values directly from fuzzer-selected values, which lets us deterministically test our way into random selection algorithms.
Since the list of values provided by the fuzzer is limited, we need a fallback implementation. Previously we used a fallback whose output was highly correlated, and some of the distribution algorithms converge very slowly on such input (we would repeatedly return the same value for billions of iterations and cause timeouts in fuzzers).
Instead, when we run out of fuzzer-supplied values, seed an mt19937 generator with the fuzzer-selected values and use that from there on. The generator then produces values deterministically (for a given fuzzer input), but with a better distribution, so fiddly algorithms can converge.
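A minimal sketch of the fallback idea described above, assuming the fuzzer hands over a plain vector of 32-bit values. The class name `FuzzerBackedBitGen` and its layout are hypothetical and are not the actual `ProtoBitGen` code; only the "replay, then seed mt19937 from the same values" shape comes from the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for a fuzzer-driven bit generator (not the gRPC class).
class FuzzerBackedBitGen {
 public:
  using result_type = uint32_t;

  explicit FuzzerBackedBitGen(std::vector<uint32_t> fuzzer_values)
      : values_(std::move(fuzzer_values)) {}

  static constexpr result_type min() { return 0; }
  static constexpr result_type max() {
    return std::numeric_limits<result_type>::max();
  }

  result_type operator()() {
    // Phase 1: hand back exactly what the fuzzer chose, in order.
    if (next_ < values_.size()) return values_[next_++];
    // Phase 2: lazily seed mt19937 from the same fuzzer values, so the tail
    // of the sequence is still a pure function of the fuzzer input.
    if (!fallback_.has_value()) {
      std::seed_seq seed(values_.begin(), values_.end());
      fallback_.emplace(seed);
    }
    return static_cast<result_type>((*fallback_)());
  }

 private:
  std::vector<uint32_t> values_;
  std::size_t next_ = 0;
  std::optional<std::mt19937> fallback_;
};
```

Because the sketch exposes `result_type`, `min()`, `max()` and `operator()`, it can be passed to standard distributions such as `std::uniform_int_distribution`.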
Closes #35621
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35621 from ctiller:cg-timeout 6c9ef9cac5
PiperOrigin-RevId: 600607424
Only call constructors when absolutely necessary (empty trivially constructible types don't need construction!!)
Similarly for destructors: if the destructor is trivial, C++ will do no work destroying the object, so let's not even make the virtual function call to get there.
(also fix a bug where we weren't calling this stuff anyway, and add a test that would have caught that)
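One way to picture the idea is a per-type table whose entries are left null when there is nothing to do, so the caller can skip the indirect call entirely. This is a sketch under that assumption; `LifecycleOps` and `MakeOps` are hypothetical names, and the real change may be structured differently.

```cpp
#include <new>
#include <type_traits>

// Hypothetical per-type operations table. A null entry means "no work needed",
// so callers can skip the indirect call altogether.
struct LifecycleOps {
  void (*construct)(void* storage);
  void (*destroy)(void* storage);
};

template <typename T>
LifecycleOps MakeOps() {
  LifecycleOps ops{nullptr, nullptr};
  // Empty, trivially default constructible types carry no state to set up.
  if constexpr (!(std::is_empty_v<T> &&
                  std::is_trivially_default_constructible_v<T>)) {
    ops.construct = [](void* storage) { new (storage) T(); };
  }
  // A trivial destructor does no work, so there is nothing to call.
  if constexpr (!std::is_trivially_destructible_v<T>) {
    ops.destroy = [](void* storage) { static_cast<T*>(storage)->~T(); };
  }
  return ops;
}
```

A caller would then write `if (ops.destroy != nullptr) ops.destroy(p);`, which costs a single null check for trivial types.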
Closes #35591
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35591 from ctiller:filter-min 2933152d61
PiperOrigin-RevId: 599521371
A call execution environment for the V3 runtime.
The `CallFilters` class will ultimately be a (private) member of `CallSpine`, and the `StackBuilder` component will be used by a channel when all of the filters it needs are known to allow the call spine to start processing a call.
This is accompanied by a reasonably extensive test suite.
I expect to fine tune semantics, implementation, and tests over the coming weeks/months as we iterate to bring up the rest of the pieces.
Closes #35533
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35533 from ctiller:filters 689c7b527b
PiperOrigin-RevId: 599220150
We were getting errors due to insane amounts of padding: enforce limits, fix b/319533934.
Closes #35537
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35537 from ctiller:fff 9f5f31ef27
PiperOrigin-RevId: 597899598
- `memory_pressure_controller` finally - allows deletion of pid_controller throughout the codebase
- `overload_protection` - one of the http2 rapid reset mitigations
- `red_max_concurrent_streams` - another http2 rapid reset mitigation
Closes #35426
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35426 from ctiller:new-years-cleanse 4651672e7e
PiperOrigin-RevId: 595205029
Whilst here, eliminate unnecessary mutexes and streamline some complexity in the read variants.
Closes #35409
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35409 from ctiller:pbe 4f9588101a
PiperOrigin-RevId: 595006455
This fixes #21619. This experimental ALPN protocol has already been removed from the other gRPC stacks.
Closes #34876
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34876 from matthewstevenson88:remove-grpc-exp 1cb9d084ea
PiperOrigin-RevId: 592080195
Built on #35278, which should be landed first
Always fail parsing when `grpclb_client_stats` is included in headers -- it's a meaningless value and the only reason to include it would be some sort of attack.
Closes #35279
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35279 from ctiller:fuzz-309756937 545448c4de
PiperOrigin-RevId: 590745978
Also cleanup a little so we're not copying redundant frame headers everywhere.
Closes #35278
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35278 from ctiller:fuzz-309716763 52589ff422
PiperOrigin-RevId: 590042072
Will be used during this transition time to run 5-pipe style filters somewhat more natively. Once everything is getting closer to 5-pipes, we'll drop this method and have the channel stack understand how to create an interception-map that can be reused per-call, instead of creating the interception-map every time a call is created.
Closes #35200
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35200 from ctiller:cg-channel-filter-api 2fc11dd273
PiperOrigin-RevId: 587940947
This is a follow-up PR to #34191. It handles the error condition where
endpoints fail to write/read in the chaotic-good client transport.
This PR needs to be merged after #34191.
Changes to fake resolver:
- Add `WaitForReresolutionRequest()` method to fake resolver response
generator to allow tests to tell when re-resolution has been requested.
- Change fake resolver response generator API to have only one mechanism
for injecting results, regardless of whether the result is an error or
whether it's triggered by a re-resolution.
Changes to grpclb_end2end_test:
- Change balancer interface such that instead of setting a list of
responses with fixed delays, the test can control exactly when each
response is set.
- Change balancer impl to always send the initial LB response, as
expected by the grpclb protocol.
- Change balancer impl to always read load reports, even if load
reporting is not expected to be enabled. (The latter case will still
cause the test to fail.) Reads are done in a different thread than
writes.
- Allow each test to directly control how many backends and balancers
are started and the client load reporting interval, so that (a) we don't
waste resources starting servers we don't need and (b) there is no need
to arbitrarily split tests across different test classes.
- Add timeouts to `WaitForAllBackends()` functionality, so that tests
will fail with a useful error rather than timing out.
- Improved ergonomics of various helper functions in the test framework.
In the process of making these changes, I found a couple of bugs:
- A bug in pick_first, which I fixed in #34885.
- A bug in grpclb, in which we were using the wrong condition to decide
whether to propagate a re-resolution request from the child policy,
which I've fixed in this PR. (This bug probably originated way back in
#18344.)
This should address a lot of the flakes seen in grpclb_e2e_test
recently.
Roll forward #34657, which was reverted in #34761.
Previous error in CMake:
```
[ RUN ] ClientTransportTest.AddOneStreamMultipleMessages
unknown file: Failure
Unexpected mock function call - returning directly.
Function call: Call(CANCELLED: )
Google Mock tried the following 1 expectation, but it didn't match:
/var/local/git/grpc/test/core/transport/chaotic_good/client_transport_test.cc:484: EXPECT_CALL(on_done, Call(absl::OkStatus()))...
Expected arg #0: is equal to OK
Actual: CANCELLED:
Expected: to be called once
Actual: never called - unsatisfied and active
/var/local/git/grpc/test/core/transport/chaotic_good/client_transport_test.cc:484: Failure
Actual function call count doesn't match EXPECT_CALL(on_done, Call(absl::OkStatus()))...
Expected: to be called once
Actual: never called - unsatisfied and active
real 0.24
user 0.00
sys 0.00
2023-10-20 01:50:32,776 FAILED: cmake/build/client_transport_test --gtest_filter=ClientTransportTest.AddOneStreamMultipleMessages GRPC_POLL_STRATEGY=epoll1 [ret=139, pid=1663532, time=0.3sec]
```
Roll forward #34191, which was reverted due to the error `2023-10-09
22:01:18,569 FAILED: cmake/build/client_transport_test
--gtest_filter=ClientTransportTest.AddMultipleStreams
GRPC_POLL_STRATEGY=none`. (Removed uses_event_engine=False and
uses_polling=False in the test build.)
This PR aims to fix the issue found in frame_fuzzer after #34191 was
merged, where frame serialization omits the frame header info and so
produces a frame that does not match the original.
This PR needs to be merged before #34657.
Ditch the old priority scheme for ordering filters, instead explicitly
mark up before/after constraints.
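Ordering by before/after constraints amounts to a topological sort. The sketch below uses Kahn's algorithm over filter names; `FilterSpec`, `OrderFilters`, the lexicographic tie-break, and the choice to ignore constraints that name unregistered filters are all assumptions made for illustration, not the registration API this change introduces.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical registration record: names only, no real filter objects.
struct FilterSpec {
  std::string name;
  std::vector<std::string> after;   // this filter must come after these
  std::vector<std::string> before;  // this filter must come before these
};

// Returns a valid order, or nullopt if the constraints contain a cycle.
// Ties are broken lexicographically so the result is deterministic.
std::optional<std::vector<std::string>> OrderFilters(
    const std::vector<FilterSpec>& specs) {
  std::map<std::string, std::set<std::string>> successors;
  std::map<std::string, int> pending;  // unmet predecessors per filter
  for (const auto& spec : specs) {
    successors[spec.name];
    pending[spec.name] = 0;
  }
  auto add_edge = [&](const std::string& from, const std::string& to) {
    // Constraints that mention unregistered filters are ignored (assumption).
    if (pending.count(from) == 0 || pending.count(to) == 0) return;
    if (successors[from].insert(to).second) ++pending[to];
  };
  for (const auto& spec : specs) {
    for (const auto& earlier : spec.after) add_edge(earlier, spec.name);
    for (const auto& later : spec.before) add_edge(spec.name, later);
  }
  std::set<std::string> ready;
  for (const auto& [name, count] : pending) {
    if (count == 0) ready.insert(name);
  }
  std::vector<std::string> order;
  while (!ready.empty()) {
    std::string name = *ready.begin();
    ready.erase(ready.begin());
    order.push_back(name);
    for (const auto& next : successors[name]) {
      if (--pending[next] == 0) ready.insert(next);
    }
  }
  if (order.size() != pending.size()) return std::nullopt;  // cycle detected
  return order;
}
```

Compared with numeric priorities, a contradictory pair of constraints shows up here as an explicit failure rather than as a silently arbitrary order.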
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Instead of fixing a target size for writes, try to adapt it a little to
observed bandwidth.
The initial algorithm tries to get large writes within 100-1000ms
maximum delay - this range probably wants to be tuned, but let's see.
The hope here is that on slow connections we buffer less behind each
write, so that when we need to send a ping ack we can do so without
great delay.
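To make the adaptation concrete, here is a rough sketch of one possible policy: grow the target when a large write completes in under 100ms, shrink it when one takes more than 1000ms. Only the 100-1000ms window comes from the text above; the class name, the doubling and halving steps, the initial size, and the bounds are all assumptions, and the real algorithm may differ.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Only the 100ms..1000ms window comes from the description; every other
// constant below is an assumption chosen for illustration.
constexpr std::chrono::milliseconds kFastWrite{100};
constexpr std::chrono::milliseconds kSlowWrite{1000};
constexpr std::size_t kMinTarget = 32 * 1024;
constexpr std::size_t kMaxTarget = 16 * 1024 * 1024;

// Hypothetical policy object tracking the current target write size.
class AdaptiveWriteSize {
 public:
  std::size_t target() const { return target_; }

  // Call when a write of roughly target() bytes has completed.
  void OnLargeWriteFinished(std::chrono::milliseconds elapsed) {
    if (elapsed < kFastWrite) {
      // The link drained the write quickly: allow bigger writes.
      target_ = std::min(target_ * 2, kMaxTarget);
    } else if (elapsed > kSlowWrite) {
      // The write sat in the pipe too long: buffer less next time.
      target_ = std::max(target_ / 2, kMinTarget);
    }
    // Inside the window: leave the target unchanged.
  }

 private:
  std::size_t target_ = 128 * 1024;
};
```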
Fix b/304114403
- adds a new experimental tracer useful for diagnosing ping timeout
failures in unit tests
- adds a pair of experimental tracers for fuzzing event engine
- fix the behavior of FuzzingEventEngine so that a RunAfter(0, ...) runs
in the same tick
- up the rate of sends (reduce the send delay) so we guarantee to be
able to send 200kb/sec in fuzzed e2e unit tests
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Experiment 1: On RST_STREAM: reduce MAX_CONCURRENT_STREAMS for one round
trip.
Experiment 2: If a settings frame is outstanding with a lower
MAX_CONCURRENT_STREAMS than is configured, and we receive a new incoming
stream that would exceed the new cap, randomly reject it.
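The decision in Experiment 2 can be sketched as a small predicate. The struct, the function name, and in particular the `reject_probability` parameter are hypothetical; the description says only that the stream is rejected "randomly" and gives no probability.

```cpp
#include <cstdint>
#include <random>

// Hypothetical snapshot of the state the decision depends on.
struct StreamLimitState {
  bool settings_ack_pending;                   // SETTINGS sent, not yet acked
  uint32_t pending_max_concurrent_streams;     // value in the outstanding frame
  uint32_t configured_max_concurrent_streams;  // value the server is set up with
  uint32_t open_streams;                       // streams currently open
};

// Decides whether a newly arrived stream should be refused. The probability
// is an assumed tuning knob, not something stated in the description.
template <typename BitGen>
bool ShouldRejectNewStream(const StreamLimitState& state,
                           double reject_probability, BitGen& gen) {
  if (!state.settings_ack_pending) return false;
  if (state.pending_max_concurrent_streams >=
      state.configured_max_concurrent_streams) {
    return false;  // the outstanding frame does not lower the cap
  }
  if (state.open_streams < state.pending_max_concurrent_streams) {
    return false;  // the new stream still fits under the lowered cap
  }
  return std::bernoulli_distribution(reject_probability)(gen);
}
```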
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This is the initial change for the chaotic-good client transport read
path, following up on the client transport write path in #33876.
Handling of endpoint failures in the transport is still pending.
It will be added once we have the inter-activity pipe with a close
function.
Fix chttp2 too_many_pings test to use only one of IPv4 or IPv6,
depending on test environment.
Also fix dumb reversed conditional bug in some other tests that was
accidentally introduced in #34426.
Looks like we've got a thread race on shutdown with some of these
tests... adding a barrier at the head of tests that require precise
transport counts in order to stabilize.
Isolate ping callback tracking to its own file.
Also takes the opportunity to simplify keepalive code by applying the
ping timeout to all pings.
Adds an experiment to allow multiple pings outstanding too (this was
originally an accidental behavior change of the work, but one that I
think may be useful going forward).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
More changes as part of the dualstack design:
- Change resolver and LB policy APIs to support multiple addresses per
endpoint. Specifically, replace `ServerAddress` with
`EndpointAddresses`, which encodes more than one address. Per-address
channel args are retained at the same level, so they are now
per-endpoint. For now, `EndpointAddresses` provides a single-address ctor
and a single-address accessor for backward compatibility, so
`ServerAddress` is an alias for `EndpointAddresses`; eventually, this
alias and the single-address methods will be removed.
- Add an `EndpointAddressSet` class, which represents an unordered set
of addresses to be used as a map key. This will be used in a number of
LB policies that need to store per-endpoint state.
- Change the LB policy API's `ChannelControlHelper::CreateSubchannel()`
method to take the address and per-endpoint channel args as separate
parameters, so that we don't need to construct a legacy `ServerAddress`
object as we create a new subchannel for each address in the endpoint.
- Change pick_first to flatten the address list.
- Change ring_hash to use `EndpointAddressSet` as the key for its
endpoint map, and to use the first address of the endpoint as the hash
key.
- Change WRR to use `EndpointAddressSet` as the key for its endpoint
weight map.
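To illustrate why an order-independent key is useful for per-endpoint state, here is a simplified sketch that stores addresses as strings in a sorted set. `AddressSetKey` and `EndpointWeights` are hypothetical names; the real `EndpointAddressSet` works on resolved addresses, and its interface is not shown here.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified key: addresses are plain strings, and the order in
// which the resolver listed them does not affect equality or ordering.
class AddressSetKey {
 public:
  explicit AddressSetKey(const std::vector<std::string>& addresses)
      : addresses_(addresses.begin(), addresses.end()) {}

  bool operator==(const AddressSetKey& other) const {
    return addresses_ == other.addresses_;
  }
  bool operator<(const AddressSetKey& other) const {
    return addresses_ < other.addresses_;
  }

 private:
  std::set<std::string> addresses_;  // sorted and de-duplicated
};

// Example of per-endpoint state keyed by the address set.
using EndpointWeights = std::map<AddressSetKey, float>;
```

With a key like this, an endpoint reported as `{A, B}` in one resolver update and `{B, A}` in the next maps to the same entry, so its state survives the reordering.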
Note that support for multiple addresses per endpoint is guarded in RR
by the existing `round_robin_delegate_to_pick_first` experiment and in
WRR by the existing `wrr_delegate_to_pick_first` experiment.
This PR does *not* include support for multiple addresses per endpoint
for the outlier_detection or xds_override_host LB policies; those will
come in subsequent PRs.
Expand our fuzzing capabilities by allowing fuzzers to choose the bits
that go into random number distribution generators.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
In certain situations the current flow control algorithm can result in
sending one flow control update write for every write sent (known
situation: rollout of promise based server calls with qps_test).
Fix things up so that the updates are only sent when truly needed, and
then fix the fallout (turns out our fuzzer had some bugs)
I've placed actual logic changes behind an experiment so that it can be
incrementally & safely rolled out.
Building out a new framing layer for chttp2.
The central idea here is to have the framing layer be solely responsible
for serialization of frames, and their deserialization - the framing
layer can reject frames that have invalid syntax - but the enacting of
what that frame means is left to a higher layer.
This class will become foundational for the promise conversion of chttp2
- by eliminating action from the parsing of frames we can reuse this
sensitive code.
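As an illustration of a layer that only handles syntax, the sketch below serializes and parses the 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier) without acting on the frame. The struct and function names are hypothetical and are not the interface of the new framing layer.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical plain-data view of an HTTP/2 frame header.
struct FrameHeader {
  uint32_t length;     // payload length, must fit in 24 bits
  uint8_t type;
  uint8_t flags;
  uint32_t stream_id;  // must fit in 31 bits
};

// Pure serialization: produces bytes, triggers no transport behaviour.
inline std::array<uint8_t, 9> SerializeFrameHeader(const FrameHeader& h) {
  assert(h.length < (1u << 24));
  assert(h.stream_id < (1u << 31));
  std::array<uint8_t, 9> out{};
  out[0] = static_cast<uint8_t>(h.length >> 16);
  out[1] = static_cast<uint8_t>(h.length >> 8);
  out[2] = static_cast<uint8_t>(h.length);
  out[3] = h.type;
  out[4] = h.flags;
  out[5] = static_cast<uint8_t>(h.stream_id >> 24);
  out[6] = static_cast<uint8_t>(h.stream_id >> 16);
  out[7] = static_cast<uint8_t>(h.stream_id >> 8);
  out[8] = static_cast<uint8_t>(h.stream_id);
  return out;
}

// Pure parsing: the reserved high bit of the stream id is masked off, and
// deciding what the frame means is left to the caller.
inline FrameHeader ParseFrameHeader(const std::array<uint8_t, 9>& b) {
  FrameHeader h;
  h.length = (uint32_t{b[0]} << 16) | (uint32_t{b[1]} << 8) | uint32_t{b[2]};
  h.type = b[3];
  h.flags = b[4];
  h.stream_id = ((uint32_t{b[5]} & 0x7fu) << 24) | (uint32_t{b[6]} << 16) |
                (uint32_t{b[7]} << 8) | uint32_t{b[8]};
  return h;
}
```

Because neither function touches transport state, both can be fuzzed and round-trip tested in isolation.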
Right now the new layer is inactive - there's a test that exercises it
relatively well, and not much more. In the next PRs I'll add
experiments to choose between this layer and the existing code in the
writing and reading paths.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This is the initial implementation of the chaotic-good client transport
write path. There will be a follow-up PR to implement the read path.
Our current implementations of Join and TryJoin rely on some complicated
template machinery, which makes them hard to maintain. I've been
thinking about ways to simplify that for some time and had something
like this in mind: use a code generator that's at least a little more
understandable to generate most of the complexity into a file that
can be checked in and inspected.
Concurrently - I have a cool optimization in mind - but it requires that
we can move promises after polling, which is a contract change. I'm
going to work through the set of primitives we have in the coming weeks
and change that contract to enable the optimization.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Need the ability to override the server-side default for keepalive
permit-without-calls without affecting client-side settings.