Isolate ping callback tracking to its own file.
Also takes the opportunity to simplify keepalive code by applying the
ping timeout to all pings.
Adds an experiment to allow multiple pings outstanding too (this was
originally an accidental behavior change of the work, but one that I
think may be useful going forward).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
More changes as part of the dualstack design:
- Change resolver and LB policy APIs to support multiple addresses per
endpoint. Specifically, replace `ServerAddress` with
`EndpointAddresses`, which encodes more than one address. Per-address
channel args are retained at the same level, so they are now
per-endpoint. For now, `EndpointAddress` provides a single-address ctor
and a single-address accessor for backward compatibility, so
`ServerAdress` is an alias for `EndpointAddresses`; eventually, this
alias and the single-address methods will be removed.
- Add an `EndpointAddressSet` class, which represents an unordered set
of addresses to be used as a map key. This will be used in a number of
LB policies that need to store per-endpoint state.
- Change the LB policy API's `ChannelControlHelper::CreateSubchannel()`
method to take the address and per-endpoint channel args as separate
parameters, so that we don't need to construct a legacy `ServerAddress`
object as we create a new subchannel for each address in the endpoint.
- Change pick_first to flatten the address list.
- Change ring_hash to use `EndpointAddressSet` as the key for its
endpoint map, and to use the first address of the endpoint as the hash
key.
- Change WRR to use `EndpointAddressSet` as the key for its endpoint
weight map.
Note that support for multiple addresses per endpoint is guarded in RR
by the existing `round_robin_delegate_to_pick_fist` experiment and in
WRR by the existing `wrr_delegate_to_pick_first` experiment.
This PR does *not* include support for multiple addresses per endpoint
for the outlier_detection or xds_override_host LB policies; those will
come in subsequent PRs.
Waiting for an active channel takes less time in non-gamma test suites
because they only start waiting after already waited for the TD backends
to be created and report healthy.
In GAMMA, these resources are created asynchronously by Kubernetes. To
compensate for this, we double the timeout for GAMMA tests.
This change is to update the TD bootstrap generator for prod tests. This
is part of the TD release process. The new image has already been merged
to staging and tested locally in google3.
cc: @sergiitk PTAL.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Expand our fuzzing capabilities by allowing fuzzers to choose the bits
that go into random number distribution generators.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Passed manual run:
https://fusion2.corp.google.com/invocations/0e337438-8573-4b54-ba1c-242b1ad58586
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Bumping gcc 7 to 8 to workaround the ongoing gcc segfault problem when
building Protobuf C++. Currently Foundational C++ requires gcc 7 so this
is a temporary measure to make the test green. We need to either make a
decision to change the minimum version of gcc in the Foundational C++ or
find a way to support gcc 7 without gcc segfault soon.
Changes -
* CsmObservability doesn't need `SetTargetSelector`. Removed it.
* Added missing plumbing of `ServiceMeshLabelsInjector` in
`CsmObservability` to actually do the metadata exchange.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Support Python 3.12.
### Testing
* Passed all Distribution Tests.
* Also tested locally by installing 3.12 artifact.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Bazelify tests from "linux/grpc_bazel_build" kokoro job by creating 3
bazelified tests - "build with strict warning", "build with no_xds=True"
and "build with no_xds=True negative test".
- also make the original "linux/grpc_bazel_build" kokoro job a no-op
(since bazelified tests now provide the same coverage).
The deleted code here was overriding the
[intended](866fc41067/tools/run_tests/run_tests.py (L62))
default test env of `GRPC_VERBOSITY=DEBUG`.
I'm just deleting it because it looks like`GRPC_TRACE=api` is not having
any affect anyways, since it relies on `GRPC_VERBOSITY=DEBUG` which it
happens to be unsetting.
- make C-core basictests use `--build_only` when running as bazelified
tests. This is because the volume of C core tests is expected to grow
very significantly after https://github.com/grpc/grpc/pull/34419 and
currently the non-bazelified counterpart of the tests (the presubmit
grpc_basictests_c_cpp_build_only job) is also "build only".
- make the linux presubmit job `grpc_basictests_c_cpp_build_only` a
noop, since the bazelified tests already give the same coverage on
presubmit.
Added a separate distribtests for gRPC C++ DLL build on Windows. This
DLL build is a community support so it should be independently run from
the existing Windows distribtests. Actual DLL test will be added.
Working towards testing against CSM Observability. Added ability to
register a prometheus exporter with our Opentelemetry plugin. This will
allow our metrics to be available at the standard prometheus port
`:9464`.
Since many tests now run reliably as bazelified tests on RBE, we can
remove them from presubmit runs
to speedup testing of PRs.
(for now, these jobs will still run on master, they can be removed from
master as a followup).
- linux/grpc_distribtests_standalone is now fully covered by bazel test
suite
a3b4c797a7/tools/bazelify_tests/test/BUILD (L202),
setting them to `presubmit=False` will stop tests from running on PRs.
- stop running tests from grpc_bazel_distribtest on PR, instead rely on
bazel distribtests running as bazelified tests.
This is just an initial scope of tests. Much of this code was written by
@ginayeh . I just did the final polish/integration step.
There are 3 main tests included:
1. The GAMMA baseline test, including the [actual GAMMA
API](https://gateway-api.sigs.k8s.io/geps/gep-1426/) rather than vendor
extensions.
2. Kubernetes-based stateful session affinity tests, where the mesh
(including SSA configuration) is configured using CRDs
3. GCP-based stateful session affinity tests, where the mesh is
configured using the networkservices APIs directly
Tests 1 and 2 will run in both prod and GKE staging, i.e.
`container.googleapis.com` and
`staging-container.sandbox.googleapis.com`. The latter of these will act
as an early detection mechanism for regressions in the controller that
translates Gateway resources into networkservices resources.
Test 3 will run against `staging-networkservices.sandbox.googleapis.com`
to act as an early detection mechanism for regressions in the control
plane SSA implementation.
The scope of the SSA tests is still fairly minimal. Session drain
testing is in-progress but not included in this PR, though several
elements required for it are (grace period, pre-stop hook, and the
ability to kill a single pod in a deployment).
---------
Co-authored-by: Jung-Yu (Gina) Yeh <ginayeh@google.com>
Co-authored-by: Sergii Tkachenko <sergiitk@google.com>
This reverts commit 2db446aa9a.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This has been stable for a bit, everywhere that the EventEngine is
enabled. Going forward, I think the event_engine_{client|listener}
experiments can probably be used to regulate thread-pool-specific
issues.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
I've added channel args to `CreateNewServerCallTracer` on the
`ServerCallTracerFactory`.
The motivation is for CSM Observability where the OTel plugin will be
configured to only do stats on servers which are xDS enabled, so I plan
to check this via channel args.
In the future, with the new scopes for metrics, I think I'll be able to
change this to only check once per server or server connection instead
of per call.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
1. Switch to CMake 1.18.
2. Make ergonomic change to push_testing_images.sh to allow building
just a single image.
3. Update packages to reduce a number of vulnerabilities reported.