This PR adds two new benchmarks.
Benchmark 1: `bm_picker`
------
Measures the pick performance of various load balancing policies. For now it covers `pick_first` and `weighted_round_robin` at 1, 10, 100, 1000, 10000, and 100000 backends.
Today's output:
```
--------------------------------------------------------------------
Benchmark                                Time       CPU   Iterations
--------------------------------------------------------------------
BM_Pick/pick_first/1                  20.4 ns   20.4 ns        68285
BM_Pick/pick_first/10                 20.6 ns   20.6 ns        68274
BM_Pick/pick_first/100                20.5 ns   20.5 ns        67817
BM_Pick/pick_first/1000               20.6 ns   20.6 ns        67347
BM_Pick/pick_first/10000              20.7 ns   20.7 ns        67317
BM_Pick/pick_first/100000             20.9 ns   20.9 ns        67385
BM_Pick/weighted_round_robin/1        54.7 ns   54.7 ns        26641
BM_Pick/weighted_round_robin/10       54.2 ns   54.2 ns        25828
BM_Pick/weighted_round_robin/100      55.2 ns   55.2 ns        26210
BM_Pick/weighted_round_robin/1000     54.1 ns   54.1 ns        25678
BM_Pick/weighted_round_robin/10000    77.3 ns   76.6 ns        15776
BM_Pick/weighted_round_robin/100000    148 ns    148 ns         9882
```
Benchmark 2: `bm_load_balanced_call_destination`
-----
This benchmark measures call performance when a call spine passes through a `LoadBalancedCallDestination`. `BM_LoadBalancedCallDestination` additionally measures the construction/destruction cost of this object.
We do not consider picker performance in this benchmark as it's separately covered by `bm_picker` above.
Today's output:
```
-------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                           Time       CPU   Iterations
-------------------------------------------------------------------------------------------------------------------------------
BM_UnaryWithSpawnPerEnd<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>      1255 ns   1255 ns         1076
BM_UnaryWithSpawnPerOp<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>       1459 ns   1459 ns          939
BM_ClientToServerStreaming<UnstartedCallDestinationFixture<LoadBalancedCallDestinationTraits>>    209 ns    209 ns         6775
BM_LoadBalancedCallDestination                                                                   92.8 ns   92.8 ns        15063
```
Notes
------
There's some duplicated code between the benchmarks and the tests, and this is OK. As the tests evolve, we'll likely want to add more checks to the fixtures, whereas as the benchmarks evolve, we may well want to optimize the fixtures so that the performance of the systems under test dominates more. That is, the duplicated code is expected to have different evolutionary tracks.
Closes #37052
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37052 from ctiller:moar-benchy 30c7072d87
PiperOrigin-RevId: 658181731
[Gpr_To_Absl_Logging] Convert some tracing macros to use << operator
Closes #37304
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37304 from tanvi-jagtap:remove_GRPC_SURFACE_TRACE_RETURNED_EVENT bc1b697fe1
PiperOrigin-RevId: 657919828
The Ruby artifact build is timing out at 1hr30m, specifically `build_artifact.ruby_native_gem_linux_aarch64-linux` in the `tools/internal_ci/linux/grpc_distribtests_ruby.sh` job. Most of the other Ruby builds take around 1hr15m, so the build time is increasing regardless.
@stanley-cheung this should probably be investigated. In the meantime, to hopefully unblock the v1.66 release, let's increase the build timeout.
Closes #37341
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37341 from drfloob:bump-ruby-artifact-build-timeout 8219cc9a33
PiperOrigin-RevId: 657747546
We're seeing a segfault issue in observability tests:
```
2024-07-26T09:09:18.422255153Z *** SIGSEGV received at time=1721984958 on cpu 0 ***
2024-07-26T09:09:18.424985750Z PC: @ 0x7e1acccb71c9 (unknown) (unknown)
2024-07-26T09:09:18.425333774Z @ 0x7e1ac714ed8c 64 absl::lts_20240116::WriteFailureInfo()
2024-07-26T09:09:18.425356717Z @ 0x7e1ac714ea15 272 absl::lts_20240116::AbslFailureSignalHandler()
2024-07-26T09:09:18.425368880Z @ 0x7e1accb98050 1584 (unknown)
2024-07-26T09:09:18.426117382Z @ 0x7e1ac77f458c 112 absl::lts_20240116::string_view::operator std::__cxx11::basic_string<><>()
2024-07-26T09:09:18.426647368Z @ 0x7e1ac78008df 688 grpc_observability::PythonOpenCensusCallTracer::PythonOpenCensusCallAttemptTracer::RecordEnd()
```
It points to `absl::string_view::operator std::__cxx11::basic_string<>()` which indicates the issue might be related to string conversion.
The most probable cause is that the `parent_->method_` string object is being destroyed before the `std::string` conversion is completed or used by `emplace_back`:
b056bc41d3/src/python/grpcio_observability/grpc_observability/client_call_tracer.cc (L325-L326)
Since it's difficult to manage the lifecycle of `method` in Python/Cython, this PR changes `method_` and `target_` from `absl::string_view` to `std::string` so that they'll always be available.
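
The ownership change described above can be sketched in isolation. This is a minimal illustration, not the actual tracer code: `OwningTracer` and `MethodAfterCallerBufferFreed` are hypothetical names, and `std::string_view` (C++17) stands in for `absl::string_view`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for a call tracer that records the method name.
// It is not the real PythonOpenCensusCallTracer.
class OwningTracer {
 public:
  // The std::string member copies the bytes at construction time, so the
  // tracer no longer depends on the lifetime of the caller's buffer.
  explicit OwningTracer(std::string_view method) : method_(method) {}

  const std::string& method() const { return method_; }

 private:
  std::string method_;
};

// Simulates a caller whose buffer is freed while the tracer is still alive.
std::string MethodAfterCallerBufferFreed() {
  auto caller_buffer = std::make_unique<std::string>("/pkg.Service/Method");
  OwningTracer tracer(*caller_buffer);
  caller_buffer.reset();  // a std::string_view member would dangle from here on
  assert(!tracer.method().empty());
  return tracer.method();
}
```

The cost is one copy per tracer, which seems a reasonable trade when the lifetime of the source buffer cannot be controlled from the C++ side.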
Closes #37329
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37329 from XuanWang-Amos:fix_otel_segfault c579022529
PiperOrigin-RevId: 657366034
Avoid calling a virtual function that immediately calls an out-of-line function; instead, call only the virtual function and inline everything it needs.
Closes #37114
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37114 from ctiller:call-arena-inline 32a9781f83
PiperOrigin-RevId: 656092114
Split off from https://github.com/grpc/grpc/pull/37253.
These logs are from the security codebase, and we don't expect to see them during normal execution, so I'm not super concerned about them. I'd still like to get to a point where we don't have any `LOG(INFO)` statements that are not guarded by a TraceFlag. So, if any of these are logs that we want enabled by default, I think we should do one of the following:
* move them to `LOG(ERROR)`
* protect them by a TraceFlag that is enabled by default. (This would still allow users to easily disable them.)
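
A rough sketch of the second option, a flag that is on by default but can be switched off. `SketchTraceFlag`, `LogIfTraced`, and `DemoOutput` are hypothetical names for illustration only; gRPC's real TraceFlag API is different and is not shown here.

```cpp
#include <atomic>
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified flag. gRPC's real TraceFlag has a different API.
class SketchTraceFlag {
 public:
  explicit SketchTraceFlag(bool default_enabled) : enabled_(default_enabled) {}

  bool enabled() const { return enabled_.load(std::memory_order_relaxed); }
  void set_enabled(bool value) {
    enabled_.store(value, std::memory_order_relaxed);
  }

 private:
  std::atomic<bool> enabled_;
};

// Writes `message` only when the flag is on; returns whether it was written.
bool LogIfTraced(const SketchTraceFlag& flag, std::ostream& out,
                 const std::string& message) {
  if (!flag.enabled()) return false;
  out << message << '\n';
  return true;
}

// Logs once with the default-on flag, then once after the user disables it.
std::string DemoOutput() {
  SketchTraceFlag security_trace(/*default_enabled=*/true);
  std::ostringstream out;
  bool first = LogIfTraced(security_trace, out, "visible by default");
  security_trace.set_enabled(false);
  bool second = LogIfTraced(security_trace, out, "suppressed after opt-out");
  assert(first && !second);
  return out.str();
}
```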
Closes #37296
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37296 from yashykt:SecurityLogSpan ea1c7718da
PiperOrigin-RevId: 655450344
Sample output:
```
➜ grpc git:(otel-metrics-benchmark) ✗ bazel-bin/test/cpp/microbenchmarks/bm_stats_plugin
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1721695619.615582 2126186 config.cc:257] gRPC experiments enabled: call_status_override_on_cancellation, call_tracer_in_transport, event_engine_dns, event_engine_listener, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
2024-07-22T17:46:59-07:00
Running bazel-bin/test/cpp/microbenchmarks/bm_stats_plugin
Run on (48 X 2450 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x24)
L1 Instruction 32 KiB (x24)
L2 Unified 512 KiB (x24)
L3 Unified 32768 KiB (x3)
Load Average: 1.16, 0.85, 0.85
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------------
Benchmark                             Time       CPU   Iterations
-----------------------------------------------------------------
BM_AddCounterWithFakeStatsPlugin   1738 ns   1738 ns       404265
BM_AddCounterWithOTelPlugin         757 ns    757 ns       928142
I0000 00:00:1721695621.304593 2126186 test_config.cc:186] TestEnvironment ends
```
Closes #37282
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37282 from yijiem:otel-metrics-benchmark eeba3dfb5e
PiperOrigin-RevId: 655286398
This PR changes many of the `LOG(INFO)` statements that are not guarded by TraceFlags to be either:
* guarded by a TraceFlag
* logged under `VLOG(2)`
Closes #37253
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37253 from yashykt:LoggingLevelFix 50d332e17a
PiperOrigin-RevId: 655248697
1. Fixes the unit test that flags log noise.
2. This test was broken for many months. As a result, a lot of log noise was added. This PR removes that noise.
3. If we want to retain any log line as `INFO` instead of `VLOG(2)`, please let me know and I will add it to the allow list.
4. In this PR, we replace the old `gpr_set_log_function` mechanism with an `absl::LogSink`. Here, the `Send` function does everything that `NoLog` used to do.
Closes #37177
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37177 from tanvi-jagtap:fix_nologging_tests ad58e2fb79
PiperOrigin-RevId: 655209718
1. The function `gpr_default_log` has been deprecated and will be deleted in a few weeks.
1. This entire unit test is being rewritten as part of another PR, but that PR will take a while to merge. In the meantime, I want to delete all instances of this function to prevent further backsliding.
https://github.com/grpc/grpc/pull/37177
PiperOrigin-RevId: 654998772
This setting has no utility in general open source, but is still useful in other environments. This PR ensures that there are no `debug` configurations when the default codegen CI runs. This simplifies the release process as well.
Closes #37277
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37277 from drfloob:gen-exp-no-dbg 6517fec6e4
PiperOrigin-RevId: 654928917
Change was created by the release automation script. See go/grpc-release.
Closes #37279
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37279 from drfloob:bump_dev_version_202407222027 4e6607411e
PiperOrigin-RevId: 654925894
SRV query
This is to be defensive to avoid allocating too much memory.
Closes #37158
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37158 from yijiem:ares-resolver-response-size 1bab0019e5
PiperOrigin-RevId: 654877815
Change was created by the release automation script. See go/grpc-release.
Closes #37276
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37276 from drfloob:bump_core_version_202407221838 7f287d8546
PiperOrigin-RevId: 654867615
Make sure call_tracers are deleted as long as c_call is unrefed.
Closes #37247
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37247 from XuanWang-Amos:fixHangingCallTracers b22a45cca9
PiperOrigin-RevId: 654865870
Noticed on a Core End2End test failure https://btx.cloud.google.com/invocations/dc3bf84d-e6ed-4b32-a24c-12489f981e46/targets/%2F%2Ftest%2Fcore%2Fend2end:cancel_with_status_test@poller%3Depoll1;config=56f5b09615e325097b100b58c41171656571290519a83c5d89a6067ef0283d46/log
```
F0000 00:00:1721017820.001684 87 tcp_server_posix.cc:354] Check failed: !s->shutdown
*** Check failure stack trace: ***
@ 0x7f32578da0e4 absl::lts_20240116::log_internal::LogMessage::SendToLog()
@ 0x7f32578d9a94 absl::lts_20240116::log_internal::LogMessage::Flush()
@ 0x7f32578da589 absl::lts_20240116::log_internal::LogMessageFatal::~LogMessageFatal()
@ 0x7f3257e340a1 tcp_server_unref()
@ 0x7f3258fcba8e grpc_core::Chttp2ServerListener::ActiveConnection::~ActiveConnection()
@ 0x7f3258fd19e7 grpc_event_engine::experimental::MemoryAllocator::New<>()::Wrapper::~Wrapper()
@ 0x7f3258fcc998 grpc_core::Chttp2ServerListener::OnAccept()
@ 0x7f3257e34962 absl::lts_20240116::internal_any_invocable::LocalInvoker<>()
@ 0x7f3257da6475 grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()::$_1::operator()()
@ 0x7f3257da4437 grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()
@ 0x7f3257da5fef absl::lts_20240116::base_internal::Callable::Invoke<>()
@ 0x7f3257dca50a grpc_event_engine::experimental::PosixEngineClosure::Run()
@ 0x7f3257c9013e grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::Step()
@ 0x7f3257c8fe48 grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::ThreadBody()
@ 0x7f3257c906df grpc_event_engine::experimental::WorkStealingThreadPool::WorkStealingThreadPoolImpl::StartThread()::$_0::__invoke()
@ 0x7f32579a106c grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix()::{lambda()#1}::__invoke()
@ 0x7f3257358609 start_thread
```
https://github.com/grpc/grpc/pull/36563 changed the refcounting mechanism incorrectly, and we ended up taking a ref on the tcp server outside the critical region. This is a time-of-check-to-time-of-use bug: we could end up reffing the tcp server when its refcount is already 0, i.e., when the listener has already been shut down. This results in an attempt to destroy the tcp server twice and an eventual crash.
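
The general shape of the fix, checking the shutdown state and taking the ref under the same lock, can be sketched as follows. This is an illustrative model only: `SketchListener` and its members are hypothetical and do not mirror the real `Chttp2ServerListener` or `tcp_server` code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical listener model; not the real Chttp2ServerListener.
class SketchListener {
 public:
  // Called when a connection is accepted. The shutdown check and the ref
  // increment happen under one lock, so Shutdown() cannot slip in between.
  bool TryRefServer() {
    std::lock_guard<std::mutex> lock(mu_);
    if (shutdown_) return false;  // too late: do not resurrect the server
    ++server_refs_;
    return true;
  }

  void UnrefServer() {
    std::lock_guard<std::mutex> lock(mu_);
    assert(server_refs_ > 0);
    --server_refs_;
  }

  void Shutdown() {
    std::lock_guard<std::mutex> lock(mu_);
    shutdown_ = true;
  }

  int server_refs() const {
    std::lock_guard<std::mutex> lock(mu_);
    return server_refs_;
  }

 private:
  mutable std::mutex mu_;
  bool shutdown_ = false;
  int server_refs_ = 0;
};
```

If the check were done under the lock but the increment after releasing it, a concurrent `Shutdown()` could land between the two steps, which is the time-of-check-to-time-of-use window described above.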
Closes #37225
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37225 from yashykt:FixChttp2Bug bc1e8dfd34
PiperOrigin-RevId: 654850991
1. The function `gpr_set_log_verbosity` has been deprecated and will be deleted in a few weeks.
1. gRPC now uses absl logging internally. Earlier, gRPC used its own custom logging mechanism called gpr, which had a whole set of functions beginning with `gpr_`.
1. This entire unit test is being rewritten as part of another PR, but that PR will take a while to merge. In the meantime, I want to delete all instances of this function to prevent further backsliding.
https://github.com/grpc/grpc/pull/37177
PiperOrigin-RevId: 654639955
The oldest gcc version that gRPC supports today is gcc 7, but gcc 7 has an issue with template support that gRPC has already run into. We recently fixed it in the gRPC library code, but some instances remain in our test code. This is not easy to fix, since finding a form that satisfies gcc 7 takes a lot of trial and error, and gcc 7 will eventually be dropped from our supported compilers. As a mitigation, test only the main grpc++ target with gcc 7, so that users can use gRPC with it without our having to fix this hairy issue.
Fixes https://github.com/grpc/grpc/issues/36751
Closes #37257
PiperOrigin-RevId: 654076384
Closes #37249
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37249 from yousukseung:xds_doc_update 3459771c1d
PiperOrigin-RevId: 653722093
Previously, the registered callback's duration was set so low (10ms) that two different OTel callbacks invoked consecutively by OTel could trigger the registered callback again, making the test flaky.
Closes #37243
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37243 from yijiem:otel-plugin-test-flake 70a1572cd2
PiperOrigin-RevId: 653717643
Tested by manually introducing a segfault and confirming that the backtrace is visible:
![Screenshot 2024-07-17 at 12 53 45 PM](https://github.com/user-attachments/assets/069e79ac-f215-4964-8b91-bd7a9e64ebfe)
Closes #37241
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37241 from XuanWang-Amos:add_signal_handler_to_python_interop_client 7316c0ca51
PiperOrigin-RevId: 653671195