We're seeing a segfault in Python CSM tests:
```
2024-08-03T09:49:45.720555997Z *** SIGSEGV received at time=1722678585 on cpu 0 ***
2024-08-03T09:49:45.721761998Z PC: @ 0x7847ffd5c1c9 (unknown) (unknown)
2024-08-03T09:49:45.722070502Z @ 0x7847fa309d8c 64 absl::lts_20240116::WriteFailureInfo()
2024-08-03T09:49:45.722175904Z @ 0x7847fa309a15 272 absl::lts_20240116::AbslFailureSignalHandler()
2024-08-03T09:49:45.722187675Z @ 0x7847ffc3d050 1592 (unknown)
2024-08-03T09:49:45.723432238Z @ 0x7847e97f9390 (unknown) (unknown)
2024-08-03T09:49:45.723487349Z @ ... and at least 1 more frames
2024-08-03T09:49:45.829702781Z [INFO tini (1)] Spawned child process '/xds_interop_client' with pid '7'
2024-08-03T09:49:45.829766869Z [DEBUG tini (1)] Received SIGCHLD
2024-08-03T09:49:45.829778749Z [DEBUG tini (1)] Reaped child with pid: '7'
2024-08-03T09:49:45.829787070Z [INFO tini (1)] Main child exited with signal (with signal 'Segmentation fault')
```
### The issue
After investigation, we found that the call tracer was deleted before `RecordEnd` was called.
### Why this fix
* To fix this, we decided to use the arena to manage the lifecycle of `CallTracer`.
* Since `CallTracer` is created in another shared object library (`grpcio_observability`) which doesn't have a dependency on gRPC core, we can't use `grpc_core::Arena` directly when creating the call tracer.
* As a workaround, we created a wrapper class `ClientCallTracerWrapper` to wrap the `CallTracer` and added another core API, `grpc_call_tracer_set_and_manage`, so that we can manage the lifecycle of `CallTracer` using the wrapper class.
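The ownership idea in the bullets above can be sketched roughly as follows. This is a toy model, not the code from this PR: `DemoCallTracer`, `DemoTracerWrapper`, `DemoCallArena`, and `RunDemoCall` are hypothetical names, and the "arena" here is just a vector of owned wrappers standing in for a per-call allocator.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call tracer; it records its lifecycle events.
class DemoCallTracer {
 public:
  explicit DemoCallTracer(std::vector<std::string>* events) : events_(events) {}
  ~DemoCallTracer() { events_->push_back("tracer destroyed"); }
  void RecordEnd() { events_->push_back("RecordEnd"); }

 private:
  std::vector<std::string>* events_;
};

// Hypothetical wrapper: it owns the tracer, so whoever owns the wrapper
// decides when the tracer is destroyed.
class DemoTracerWrapper {
 public:
  explicit DemoTracerWrapper(std::unique_ptr<DemoCallTracer> tracer)
      : tracer_(std::move(tracer)) {}
  DemoCallTracer* tracer() { return tracer_.get(); }

 private:
  std::unique_ptr<DemoCallTracer> tracer_;
};

// Toy per-call owner standing in for an arena: everything it manages is
// destroyed only when the call itself is torn down.
class DemoCallArena {
 public:
  DemoTracerWrapper* Manage(std::unique_ptr<DemoCallTracer> tracer) {
    assert(tracer != nullptr);
    managed_.push_back(std::make_unique<DemoTracerWrapper>(std::move(tracer)));
    return managed_.back().get();
  }

 private:
  std::vector<std::unique_ptr<DemoTracerWrapper>> managed_;
};

// Runs one simulated call and returns the order in which events happened.
std::vector<std::string> RunDemoCall() {
  std::vector<std::string> events;
  {
    DemoCallArena arena;
    DemoTracerWrapper* wrapper =
        arena.Manage(std::make_unique<DemoCallTracer>(&events));
    wrapper->tracer()->RecordEnd();
  }  // The arena goes away here, and the tracer with it.
  return events;
}
```

Calling `RunDemoCall()` yields `RecordEnd` followed by `tracer destroyed`: because the tracer's lifetime is tied to the per-call owner, it cannot be deleted before the call finishes recording.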
Closes #37460
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37460 from XuanWang-Amos:fix_otel_segfault 33c0b98c64
PiperOrigin-RevId: 662966853
Closes #37442
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37442 from yijiem:migrate-iomgr-getdnsresolver 7b3ed7d980
PiperOrigin-RevId: 662957279
Closes #37466
PiperOrigin-RevId: 662606899
The `peer_state_based_framing` experiment is not used. The other experiment has been rolled out to 100% in prod for a while now. The expiry dates of a few other experiments are updated.
PiperOrigin-RevId: 662565880
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Closes #37387
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37387 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_06 1b87cfece9
PiperOrigin-RevId: 662349186
[Gpr_To_Absl_Logging] Removing EXECUTOR_TRACE and replacing it with GRPC_TRACE_LOG
Closes #37433
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37433 from tanvi-jagtap:remove_EXECUTOR_TRACE 2a7bdc97ad
PiperOrigin-RevId: 661498614
[Gpr_To_Absl_Logging] Replacing GRPC_ARES_RESOLVER_TRACE_LOG with GRPC_TRACE_LOG
Also moved the definition of GRPC_ARES_RESOLVER_TRACE_LOG from the header to the .cpp file where it is used, to discourage further use of this macro.
Closes #37434
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37434 from tanvi-jagtap:remove_GRPC_ARES_RESOLVER_TRACE_LOG e7883d70e9
PiperOrigin-RevId: 661297848
[Gpr_To_Absl_Logging] Remove gpr_should_log from the header file
Closes #37420
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37420 from tanvi-jagtap:clean_up_ruby_php_01 ce303fa808
PiperOrigin-RevId: 661118631
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Closes #37389
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37389 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_04 bcf1c6255f
PiperOrigin-RevId: 661111633
Closes #36943
This PR adds `original_request` to any `_reflection_pb2.ServerReflectionResponse` generated by `grpc_reflection.v1alpha`
Closes #36944
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36944 from Drarig29:corentin.girard/add-original-request-to-ServerReflectionResponse e6a94789e1
PiperOrigin-RevId: 660746169
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Closes #37388
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37388 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_05 079535c5e1
PiperOrigin-RevId: 660693555
Fixes https://github.com/grpc/grpc/issues/37234
Following up on the problem described in https://github.com/grpc/grpc/pull/36903, there are a number of paths in `client_server_spec.rb` and a few other tests where client call objects can leak due to RPC lifecycles not being properly completed, leading to a thread not terminating.
Some of the tests, which don't use the surface-level APIs, are changed to manually close calls (rather than rely on GC, which might not happen before the shutdown of Ruby threads). `client_server_spec.rb` is updated to use surface-level APIs, which manage call lifecycles correctly (this also improves the test's fidelity).
While we're here: expose `cancel_with_status` on call operations. This was only accidentally private so far. The test refactoring caught it.
Closes #37410
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37410 from apolcyn:fix_call_leak b23047251c
PiperOrigin-RevId: 660430463
Includes a few changes to pollset stuff to make it easier to not use pollsets (which I think is going to be generally helpful in the coming months)
Closes #37397
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37397 from ctiller:client-chicken 1099cd500a
PiperOrigin-RevId: 660014128
This is so that the client can retry in this scenario. Fix #37306
Closes #37371
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37371 from yijiem:better-ares-error-code 6f367d1051
PiperOrigin-RevId: 659662762
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Used find and replace for this PR.
Find string:
```
if \(GRPC_TRACE_FLAG_ENABLED\((.*)\)\) {
(.*)LOG\(INFO\)(.*)
(.*);
(.*)}(.*)
```
Replace string:
`GRPC_TRACE_LOG($1, INFO) $3 $4;`
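To illustrate why this rewrite is behavior-preserving, here is a minimal sketch of the before and after shapes. The macros and types below (`DemoTraceFlag`, `DEMO_TRACE_FLAG_ENABLED`, `DEMO_TRACE_LOG`, `LogBefore`, `LogAfter`) are hypothetical stand-ins written for this example, not the real gRPC or Abseil definitions, and a string stream stands in for the log sink so the result can be inspected.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a trace flag.
struct DemoTraceFlag {
  bool enabled;
};

// Hypothetical stand-ins for the two macros. The second one folds the
// "is the flag enabled?" check into the logging statement itself.
#define DEMO_TRACE_FLAG_ENABLED(flag) ((flag).enabled)
#define DEMO_TRACE_LOG(flag, sink) \
  if (!DEMO_TRACE_FLAG_ENABLED(flag)) { \
  } else                              \
    (sink)

// "Before" shape: explicit if-block wrapped around a streamed log line.
std::string LogBefore(const DemoTraceFlag& flag, int call_id) {
  std::ostringstream out;
  if (DEMO_TRACE_FLAG_ENABLED(flag)) {
    out << "call " << call_id
        << " started";
  }
  return out.str();
}

// "After" shape: a single conditional-logging statement.
std::string LogAfter(const DemoTraceFlag& flag, int call_id) {
  std::ostringstream out;
  DEMO_TRACE_LOG(flag, out) << "call " << call_id << " started";
  return out.str();
}
```

With the flag enabled, both functions produce the same text; with it disabled, both produce nothing and the streamed arguments are not evaluated in either form.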
Closes #37350
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37350 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_02 94a14ae3c5
PiperOrigin-RevId: 659040558
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Used find and replace for this PR.
Find string:
```
if \(GRPC_TRACE_FLAG_ENABLED\((.*)\)\) {
(.*)LOG\(INFO\)(.*)
(.*);
(.*)}(.*)
```
Replace string:
`GRPC_TRACE_LOG($1, INFO) $3 $4;`
Closes #37349
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37349 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_01 70ca4b8dce
PiperOrigin-RevId: 659039786
[Gpr_To_Absl_Logging] Using GRPC_TRACE_LOG instead of GRPC_TRACE_FLAG_ENABLED
Used find and replace for this PR.
Find string:
```
if \(GRPC_TRACE_FLAG_ENABLED\((.*)\)\) {
(.*)LOG\(INFO\)(.*)
(.*);
(.*)}(.*)
```
Replace string:
`GRPC_TRACE_LOG($1, INFO) $3 $4;`
Closes #37351
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37351 from tanvi-jagtap:GRPC_TRACE_FLAG_ENABLED_03 6b9c0f2737
PiperOrigin-RevId: 659033759
This seems to cut ~5 nanoseconds (10%) off of the `BM_AddCounterWithOTelPlugin` benchmark.
Closes #37311
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37311 from yijiem:otel-metrics-benchmark 2f8c975cfc
PiperOrigin-RevId: 658915810
Two new benchmarks herein.
Benchmark 1: `bm_picker`
------
Measures pick performance for various load balancing policies. For now we cover `pick_first` and `weighted_round_robin` at 1, 10, 100, 1000, 10000, and 100000 backends.
Today's output:
```
------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations
------------------------------------------------------------------------------
BM_Pick/pick_first/1                      20.4 ns         20.4 ns        68285
BM_Pick/pick_first/10                     20.6 ns         20.6 ns        68274
BM_Pick/pick_first/100                    20.5 ns         20.5 ns        67817
BM_Pick/pick_first/1000                   20.6 ns         20.6 ns        67347
BM_Pick/pick_first/10000                  20.7 ns         20.7 ns        67317
BM_Pick/pick_first/100000                 20.9 ns         20.9 ns        67385
BM_Pick/weighted_round_robin/1            54.7 ns         54.7 ns        26641
BM_Pick/weighted_round_robin/10           54.2 ns         54.2 ns        25828
BM_Pick/weighted_round_robin/100          55.2 ns         55.2 ns        26210
BM_Pick/weighted_round_robin/1000         54.1 ns         54.1 ns        25678
BM_Pick/weighted_round_robin/10000        77.3 ns         76.6 ns        15776
BM_Pick/weighted_round_robin/100000        148 ns          148 ns         9882
```
Benchmark 2: `bm_load_balanced_call_destination`
-----
This benchmark measures call performance when a call spine passes through a `LoadBalancedCallDestination`. `BM_LoadBalancedCallDestination` additionally measures the construction/destruction cost of this object.
We do not consider picker performance in this benchmark as it's separately covered by `bm_picker` above.
Today's output:
```
-----------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                               Time             CPU   Iterations
-----------------------------------------------------------------------------------------------------------------------------------------
BM_UnaryWithSpawnPerEnd<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>          1255 ns         1255 ns         1076
BM_UnaryWithSpawnPerOp<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>           1459 ns         1459 ns          939
BM_ClientToServerStreaming<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>        209 ns          209 ns         6775
BM_LoadBalancedCallDestination                                                                       92.8 ns         92.8 ns        15063
```
Notes
------
There's some duplicated code between the benchmarks and tests. This is OK: as the tests evolve we'll likely want to add more checks to the fixtures, whereas as the benchmarks evolve we may well want to optimize the fixtures so that the performance of the systems under test dominates more. That is, the duplicated code is expected to follow different evolutionary tracks.
Closes #37052
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37052 from ctiller:moar-benchy 30c7072d87
PiperOrigin-RevId: 658181731
[Gpr_To_Absl_Logging] Convert some tracing macros to use << operator
Closes #37304
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37304 from tanvi-jagtap:remove_GRPC_SURFACE_TRACE_RETURNED_EVENT bc1b697fe1
PiperOrigin-RevId: 657919828
We're seeing a segfault issue in observability tests:
```
2024-07-26T09:09:18.422255153Z *** SIGSEGV received at time=1721984958 on cpu 0 ***
2024-07-26T09:09:18.424985750Z PC: @ 0x7e1acccb71c9 (unknown) (unknown)
2024-07-26T09:09:18.425333774Z @ 0x7e1ac714ed8c 64 absl::lts_20240116::WriteFailureInfo()
2024-07-26T09:09:18.425356717Z @ 0x7e1ac714ea15 272 absl::lts_20240116::AbslFailureSignalHandler()
2024-07-26T09:09:18.425368880Z @ 0x7e1accb98050 1584 (unknown)
2024-07-26T09:09:18.426117382Z @ 0x7e1ac77f458c 112 absl::lts_20240116::string_view::operator std::__cxx11::basic_string<><>()
2024-07-26T09:09:18.426647368Z @ 0x7e1ac78008df 688 grpc_observability::PythonOpenCensusCallTracer::PythonOpenCensusCallAttemptTracer::RecordEnd()
```
It points to `absl::string_view::operator std::__cxx11::basic_string<>()` which indicates the issue might be related to string conversion.
The most probable cause is that the `parent_->method_` string object is being destroyed before the `std::string` conversion is completed or used by `emplace_back`:
b056bc41d3/src/python/grpcio_observability/grpc_observability/client_call_tracer.cc (L325-L326)
Since it's difficult to manage the lifecycle of `method` in Python/Cython, this PR changes `method_` and `target_` from `absl::string_view` to `std::string` so that they'll always be available.
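The difference between borrowing and owning the method name can be sketched as follows. This is a simplified illustration, not the tracer code from this PR: `BorrowingTracer`, `OwningTracer`, and `MakeTracerFromShortLivedBuffer` are hypothetical names, and `std::string_view` is used in place of `absl::string_view` to keep the example free of dependencies.

```cpp
#include <string>
#include <string_view>

// Hypothetical, simplified tracer that only borrows the method name.
// If the caller's buffer is freed or reused first, method_ dangles and
// reading it is undefined behavior.
class BorrowingTracer {
 public:
  explicit BorrowingTracer(std::string_view method) : method_(method) {}
  std::string_view method() const { return method_; }

 private:
  std::string_view method_;
};

// The same tracer, but it copies the bytes into a std::string it owns,
// so the name stays valid for as long as the tracer itself exists.
class OwningTracer {
 public:
  explicit OwningTracer(std::string_view method) : method_(method) {}
  const std::string& method() const { return method_; }

 private:
  std::string method_;
};

// Builds a tracer from a buffer that is overwritten and then destroyed
// before the caller ever looks at the tracer.
OwningTracer MakeTracerFromShortLivedBuffer() {
  std::string buffer = "/helloworld.Greeter/SayHello";
  OwningTracer tracer(buffer);
  buffer.assign("buffer reused for something else");
  return tracer;
}
```

The tracer returned by `MakeTracerFromShortLivedBuffer()` still reports the original method name. Doing the same with `BorrowingTracer` would leave it pointing at memory it does not control, which matches the kind of crash seen in the stack trace above.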
Closes #37329
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37329 from XuanWang-Amos:fix_otel_segfault c579022529
PiperOrigin-RevId: 657366034
Avoid calling a virtual function that then immediately calls an out-of-line function; instead, call the virtual function and inline everything it needs.
Closes #37114
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37114 from ctiller:call-arena-inline 32a9781f83
PiperOrigin-RevId: 656092114