This PR adds CSM Observability testing capability in the PSM Interop testing framework. This PR mostly changes the framework Python code.
This adds a flag `enable_csm_observability` to the client / server deployment yaml file such that, when enabled, we will create a GMP `PodMonitoring` resource and pass the `--enable_csm_observability` to each language's client / server container (for them to actually enable the Prometheus endpoint)
I added a new test under `tests/csm/csm_observability_test.py`. This is basically a copy of the `tests/baseline_test.py` but with the `enable_csm_observability=True`.
Other PRs for this whole thing to work:
- https://github.com/grpc/grpc/pull/34752: The `PodMonitoring` resource yaml template
- https://github.com/grpc/grpc/pull/34832: Support for the `--enable_csm_observability` flag in the C++ client/server image
Closes#34835
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34835 from stanley-cheung:csm-o11y-framework-changes 0b3d0eb7ed
PiperOrigin-RevId: 595502496
- `memory_pressure_controller` finally - allows deletion of pid_controller throughout the codebase
- `overload_protection` - one of the http2 rapid reset mitigations
- `red_max_concurrent_streams` - another http2 rapid reset mitigation
Closes#35426
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35426 from ctiller:new-years-cleanse 4651672e7e
PiperOrigin-RevId: 595205029
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35292
PiperOrigin-RevId: 595188404
Remove the old `switch` library - this used to be an implementation detail of `Seq`, `TrySeq` - but has become unused.
Add a new user facing primitive `Switch` that fills a similar role to `switch` in C++ - selecting a promise to execute based on a primitive discriminator - much like `If` allows selection based on a boolean discriminator now.
A future change will optimize this to actually lower the `Switch` into an actual `switch` statement, but for right now I want to get the functionality in.
Closes#35424
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35424 from ctiller:switchy 5308a914c6
PiperOrigin-RevId: 595140965
Whilst here, eliminate unnecessary mutexes and streamline some complexity in the read variants.
Closes#35409
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35409 from ctiller:pbe 4f9588101a
PiperOrigin-RevId: 595006455
Provide a public experimental API and bazel compatible build target for OpenTelemetry metrics.
Details -
* New `OpenTelemetryPluginBuilder` class that provides the API specified in https://github.com/grpc/proposal/blob/master/A66-otel-stats.md
* The existing `grpc::internal::OpenTelemetryPluginBuilder` class is moved to `grpc::internal::OpenTelemetryPluginBuilderImpl` for disambiguation.
* Renamed `OTel` in some instances to `OpenTelemetry` for consistency.
Closes#35348
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35348 from yashykt:OTelPublicApi e32328825e
PiperOrigin-RevId: 594271246
Adds temporary `call.cc` and `connected_channel.cc` scaffolding to run `CallInterceptor`/`CallHandler` style calls.
This will get ripped out as soon as the v3 transition is completed.
Closes#35312
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35312 from ctiller:v3-accept ae0bf81f8b
PiperOrigin-RevId: 594128029
It turned out that the previous change missed two things which this PR
has
- Fix function `_dockerized_genrule` not to have `timeout` and `flaky`
which Bazel doesn't have. (Bazel 7 may either drop these arguments or
become more strict about passing unrecognized ones)
- Disabled python bazel distribution tests with Bazel 7. This needs to
be addressed by https://github.com/grpc/grpc/issues/35391.
Currently, each subchannel wrapper stores a ref to the policy and its key in the policy's subchannel map, and it looks up its entry in the map whenever it needs to modify that entry. There's some complexity due to the need to avoid deadlocks in the case where we remove the last strong ref to a subchannel wrapper from a map entry. This approach has a number of problems:
- The subchannel wrapper is dropping its key when it gets orphaned, meaning that it will *never* actually remove itself from the map entry when it is destroyed, which is not what we want. (This isn't actually causing a bug, but it does mean that we'll never delete the subchannel wrapper, even when it is really unused.)
- Having the subchannel wrapper look up its key in the map every time it needs to modfy its entry is fairly inefficient, especially if there are a large number of endpoints.
- There is a race condition that was accidentally introduced in #34472. The subchannel wrapper's key is being modified when the subchannel wrapper is orphaned, but that PR changed the picker to read the same value without any synchronization between the two, and we didn't notice the bug or catch it in any tests.
- The code is fairly hard to understand, with a bunch of special cases that are not obvious to the reader.
This PR addresses those problems by making the entries in the subchannel map be ref-counted, where a ref is held both by the map and by each subchannel wrapper. Specific changes:
- Because the wrapper holds a ref directly to the map entry, there is no longer any need for a map lookup every time the subchannel wrapper needs to access its map entry.
- We now avoid deadlocks by waiting until after we've released the lock to drop refs to subchannel wrappers, so there is no more need to modify the internal state of a subchannel wrapper.
- We now remove subchannel wrappers from the map entry when they are orphaned, so there is no longer any need to hold a weak ref in the map entry; instead, we now just use a raw pointer.
- The connectivity state is now stored in the map entry instead of in each individual subchannel wrapper. And we no longer need to use an atomic for it, since we are always holding the lock when it is accessed.
- All state guarded by the mutex (other than the subchannel map itself) is now in the subchannel entry, and I have added lock annotations so that the compiler can enforce the lock semantics.
This PR paves the way for subsequent work that will make SSA work across priorities (see in-progress [gRFC A75](https://github.com/grpc/proposal/pull/405)), where we will need to generalize the behavior such that we hold strong refs to subchannels in any state (not just DRAINING) when the child policy is not holding its own refs.
Closes#35379
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35379 from markdroth:xds_ssa_tsan_fix 4927e04eb1
PiperOrigin-RevId: 594015497
Rename `saved_errno` to `connect_errno`.
Avoid relying on `errno` being zero if `connect(2)` does not fail.
Slightly linearize control flow.
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35356
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35356 from benjaminp:connect-errno 0dabdf0562
PiperOrigin-RevId: 593819124
- Fixed the bazel distrib tests with Bazel 7 by disabling bzlmod option.
- Added a new note for bzlmod to the doc.
Closes#35390
PiperOrigin-RevId: 593816700
- Added Bazel 7 to the support bazel versions.
- Changed the default Bazel version to 7.
- Fixed Android Binder build issue.
Closes#35362
PiperOrigin-RevId: 592946781
This is a prerequisite change to start supporting Bazel 7. Changes are
- Disabled bzlmod which Bazel 7 begins to enable by default. This eventually needs to be done to support bzlmod but not now.
- Upgraded some bazel rule dependencies which are required to support Bazel 7.
- Using Python 3 explcitly as Bazel 7 begins to reject Python 2.
Note that this isn't enough to enable Bazel 7 by default and another PR will follow for that.
Closes#35374
PiperOrigin-RevId: 592931675
Simple `assert` statements don't help much to know what needs to be done. Instead, explicit error messages will let us know what's wrong which is helpful to know what to look at.
Closes#35375
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35375 from veblush:check-work 0733499c31
PiperOrigin-RevId: 592920747
Due to an internal issue, some code from objective-c folder was copied
into objective_c .
Trying to undo that.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Fix: https://github.com/grpc/grpc/issues/35085
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35325
PiperOrigin-RevId: 592635611
Fixes#34929.
This PR hash-pins all Actions used in workflows and sets up dependabot
to keep them up-to-date.
Dependabot will send at most one PR per month. That PR will update the
hashes and version comments of all Actions with new versions.
I also suggest you enable Dependabot Security Updates in the repo's
[Code security &
analysis](https://github.com/grpc/grpc/settings/security_analysis)
settings (if you haven't already). This will make Dependabot send a PR
as soon as a dependency is found to have a vulnerability.
---------
Signed-off-by: Pedro Kaj Kjellerup Nacht <pnacht@google.com>
The Server Reflection Protocol v1 is already released. I think v1 is a better link to the Protocol, not v1alpha now.
Closes#35330
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35330 from y-yagi:patch-1 a89cb60b1e
PiperOrigin-RevId: 592356694
`grpc_tcp_client_create_from_prepared_fd` distinguishes "in-progress" `connect(2)` errors from fatal errors. However, it does a bunch of external calls between calling `connect(2`) and checking the `errno`. These calls may not preserve `errno`.
This change parallels defensive `errno` saving pattern in the event_engine.
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35064
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35064 from benjaminp:save-errno a01f6b4309
PiperOrigin-RevId: 592350454
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#35353
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35353 from yijiem:fix-ios-dns-test f84d29a9db
PiperOrigin-RevId: 592339589
Previously, `RefCountedPtr<>` and `WeakRefCountedPtr<>` incorrectly allowed
implicit casting of any type to any other type. This hadn't caused a
problem until recently, but now that it has, we need to fix it. I have
fixed this by changing these smart pointer types to allow type
conversions only when the type used is convertible to the type of the
smart pointer. This means that if `Subclass` inherits from `Base`, then
we can set a `RefCountedPtr<BaseClass>` to a value of type
`RefCountedPtr<Subclass>`, but we cannot do the reverse.
We had been (ab)using this bug to make it more convenient to deal with
down-casting in subclasses of ref-counted types. For example, because
`Resolver` inherits from `InternallyRefCounted<Resolver>`, calling
`Ref()` on a subclass of `Resolver` will return `RefCountedPtr<Resolver>`
rather than returning the subclass's type. The ability to implicitly
convert to the subclass type made this a bit easier to deal with. Now
that that ability is gone, we need a different way of dealing with that
problem.
I considered several ways of dealing with this, but none of them are
quite as ergonomic as I would ideally like. For now, I've settled on
requiring callers to explicitly down-cast as needed, although I have
provided some utility functions to make this slightly easier:
- `RefCounted<>`, `InternallyRefCounted<>`, and `DualRefCounted<>` all
provide a templated `RefAsSubclass<>()` method that will return a new
ref as a subclass. The type used with `RefAsSubclass()` must be a
subclass of the type passed to `RefCounted<>`, `InternallyRefCounted<>`,
or `DualRefCounted<>`.
- In addition, `DualRefCounted<>` provides a templated `WeakRefAsSubclass<T>()`
method. This is the same as `RefAsSubclass()`, except that it returns
a weak ref instead of a strong ref.
- In `RefCountedPtr<>`, I have added a new `Ref()` method that takes
debug tracing parameters. This can be used instead of calling `Ref()`
on the underlying object in cases where the caller already has a
`RefCountedPtr<>` and is calling `Ref()` only to specify the debug
tracing parameters. Using this method on `RefCountedPtr<>` is more
ergonomic, because the smart pointer is already using the right
subclass, so no down-casting is needed.
- In `WeakRefCountedPtr<>`, I have added a new `WeakRef()` method that
takes debug tracing parameters. This is the same as the new `Ref()`
method on `RefCountedPtr<>`.
- In both `RefCountedPtr<>` and `WeakRefCountedPtr<>`, I have added a
templated `TakeAsSubclass<>()` method that takes the ref out of the
smart pointer and returns a new smart pointer of the down-casted type.
Just as with the `RefAsSubclass()` method above, the type used with
`TakeAsSubclass()` must be a subclass of the type passed to
`RefCountedPtr<>` or `WeakRefCountedPtr<>`.
Note that I have *not* provided an `AsSubclass<>()` variant of the
`RefIfNonZero()` methods. Those methods are used relatively rarely, so
it's not as important for them to be quite so ergonomic. Callers of
these methods that need to down-cast can use
`RefIfNonZero().TakeAsSubclass<>()`.
PiperOrigin-RevId: 592327447
This fixes#21619. This experimental ALPN protocol has already been removed from the other gRPC stacks.
Closes#34876
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34876 from matthewstevenson88:remove-grpc-exp 1cb9d084ea
PiperOrigin-RevId: 592080195
When an error occurs in `TrySeq`, log the error.
Also, in the trace statements, capture the file/line that the sequence was constructed at, and log that with the traces.
Closes#35319
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35319 from ctiller:trace-seq bc0d47395f
PiperOrigin-RevId: 591999228
upb currently defines upb_Message to be void but that may soon change, at which point the gRPC build would break without these casts.
This time for sure! With C++-style casts that also actually build!
Closes#35332
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35332 from ericsalo:master e7768a37c9
PiperOrigin-RevId: 591959873