As discussed, this change adds scoping to `CsmObservability` such that when that object goes out of scope, new channels and servers don't record metrics. In the documentation, I've talked about how existing channels/servers are going to continue to record metrics but i've left room for us to change that behavior in the future.
The current way of doing this is through a global bool since there can only be one plugin right now, but we'll change this to use the global stats plugin registry in the future.
Closes#35835
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35835 from yashykt:DisableCsmObsOnScope 33a7c2f7bc
PiperOrigin-RevId: 605468117
Just to be future-proof, I'm amending the `void` return status of `BuildAndRegisterGlobal` in `OpenTelemetryPluginBuilder` to absl::Status.
This will be backported to 1.61 as well.
Closes#35659
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35659 from yashykt:UpdateOtelApiToAddStatus 07d3f41b8a
PiperOrigin-RevId: 601458408
We are no longer sure about this API, so re-experimentalizing it.
This PR will be backported to 1.61 as well.
Closes#35660
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35660 from yashykt:ReexperimentalizeCsmPluginOption 4f114a54d9
PiperOrigin-RevId: 601378856
Provide a public experimental API and bazel compatible build target for OpenTelemetry metrics.
Details -
* New `OpenTelemetryPluginBuilder` class that provides the API specified in https://github.com/grpc/proposal/blob/master/A66-otel-stats.md
* The existing `grpc::internal::OpenTelemetryPluginBuilder` class is moved to `grpc::internal::OpenTelemetryPluginBuilderImpl` for disambiguation.
* Renamed `OTel` in some instances to `OpenTelemetry` for consistency.
Closes#35348
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35348 from yashykt:OTelPublicApi e32328825e
PiperOrigin-RevId: 594271246
There's an ongoing discussion on whether we should have API to disable
default metrics. Removing this API till we have a decision on that.
I'm keeping the internal API for enabling/disabling metrics on the OTel
plugin for now, just not exposing it publicly.
Changes -
* CsmObservability doesn't need `SetTargetSelector`. Removed it.
* Added missing plumbing of `ServiceMeshLabelsInjector` in
`CsmObservability` to actually do the metadata exchange.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Going forward `[[nodiscard]]` is the portable way to spell this;
requires yanking a bunch of usage from after the param list to before.
We should further refine the GRPC_MUST_USE_RESULT macro to make it work
uniformly for any compilers that it doesn't today (most likely by making
it expand to nothing).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This PR aims to de-experimentalize the APIs for GCP Observability.
We would have ideally wanted public feedback before declaring the APIs
stable, but we need stable APIs for GA.
Changes made after API review with @markdroth@veblush, @ctiller and the
entire Core/C++ team -
* The old experimental APIs `grpc::experimental::GcpObservabilityInit`
and `grpc::experimental::GcpObservabilityClose` are now deprecated and
will be deleted after v.1.55 release.
* The new API gets rid of the Close method and follows the RAII idiom
with a single `grpc::GcpObservability::Init()` call that returns an
`GcpObservability` object, the lifetime of which controls when
observability data is flushed.
* The `GcpObservability` class could in the future add more methods. For
example, a debug method that shows the current configuration.
* Document that GcpObservability initialization and flushing (on
`GcpObservability` destruction) are blocking calls.
* Document that gRPC is still usable if GcpObservability initialization
failed. (Added a test to prove the same).
* Since we don't have a good way to flush stats and tracing with
OpenCensus, the examples required users to sleep for 25 seconds. This
sleep is now part of `GcpObservability` destruction.
Additional Implementation details -
* `GcpObservability::Init` is now marked with `GRPC_MUST_USE_RESULT` to
make sure that the results are used. We ideally want users to store it,
but this is better than nothing.
* Added a note on GCP Observability lifetime guarantees.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts commit 7bd9267f32.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This PR adds batching support for GCP Observability logging. So instead
of the naive creating a new RPC to cloud logging for each logging event,
we now batch the log events to meet one of the following requirements -
* Batch size of 1000
* Batch memory consumption of 1MB
* A timeout period of 1sec after which we flush the accumulated batch
irrespective of the size.
There can also be cases where for some reason the RPCs fail or the batch
just accumulates to a very large size(100000 entries or 10MB in size).
In such cases, we just log the events with gpr_log instead of just
continuing to accumulate.
Additionally, `GcpObservabilityClose()` has been added to gracefully
shut off logging where we block till all the currently logged events are
flushed. (We might be able to gracefully shut off stats and tracing in
the future too.)
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
* Revert "Revert "Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106)" (#32272)" (#32279)" (#32293)"
This reverts commit 1f960697c5.
* Do not create CallMetricRecorder if call is null.
* Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106)" (#32272)"
This reverts commit deb1e25543.
* Fix by caching call metric recording stuff in async request
PR #32106 caused msan errors in some tests while de-referencing the
server object where async calls are active after the server is
destroyed. Instead cache the ServerMetricRecorder pointer.
* copyright headers fixed
* clang fixes.
* Revert "Revert "ORCA: implement ORCA RPC service for OOB backend metric reporting (#29215)" (#29351)"
This reverts commit 71b355624f.
* move ORCA service to its own BUILD rule
* ORCA: implement ORCA RPC service for OOB backend metric reporting
* fix unused parameter error
* gen_upb_api
* add missing build deps
* increase test timing fudge factor
* add missing copyright header
* buildifier
* don't register as a generic service
* report interval defaults to min interval
* don't regenerate the response proto unless something changed
* use INTERNAL for proto parsing failure
* use absl::Duration in public API
Based on a handful of https://abseil.io/tips, it's generally advised to
only fully-qualify namespaces when in a `using` statement, or when it's
otherwise required for compilation. In all other cases, the general
recommendation is to not fully-qualify.
This change fixes most `grpc.*` namespace uses. There are potential
challenges in trying to make blanket changes to non-gRPC namespace uses,
such as `::testing`, since there is also a `grpc::testing` namespace.
Based on a handful of https://abseil.io/tips, it's generally advised to
only fully-qualify namespaces when in a `using` statement, or when it's
otherwise required for compilation. In all other cases, the general
recommendation is to not fully-qualify.
This change fixes most `grpc.*` namespace uses. There are potential
challenges in trying to make blanket changes to non-gRPC namespace uses,
such as `::testing`, since there is also a `grpc::testing` namespace.
* Implement C++ Admin Interface API
* Address reviewer's requests
* Remove static asserts for raw pointers
* Make sanity tests happy
* Windows: pacify conflict between ifndef and macros
* Disable admin services test on iOS
* Make iOS happy by:
* Letting grpcpp_admin conditionally depend on grpcpp_csds
* Fix an unexpected side-effect of dependency update