mirror of https://github.com/grpc/grpc.git
[benchmark] Measure all threads' CPU usage in callback benchmarks (#37936)
By default, google/benchmark measures only the main thread's CPU time, and it uses that CPU time to decide how many iterations to run. The callback API does most of its work on internal threads, so these benchmarks took a long time to run and did not produce useful metrics. See the documentation: https://google.github.io/benchmark/user_guide.html#cpu-timers
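To illustrate why a main-thread CPU timer reads close to zero when the work happens elsewhere, here is a minimal sketch that compares the POSIX per-thread and per-process CPU clocks. It does not use google/benchmark and is not that library's implementation; `CpuSeconds`, `CpuSample`, and `MeasureWorkOnHelperThread` are hypothetical names chosen for this example, and it assumes a POSIX system that provides `CLOCK_THREAD_CPUTIME_ID` and `CLOCK_PROCESS_CPUTIME_ID`.

```cpp
#include <time.h>

#include <cassert>
#include <thread>

// Reads a POSIX CPU-time clock and returns its value in seconds.
static double CpuSeconds(clockid_t clock) {
  timespec ts{};
  clock_gettime(clock, &ts);
  return static_cast<double>(ts.tv_sec) + ts.tv_nsec * 1e-9;
}

// Hypothetical result type for this sketch.
struct CpuSample {
  double thread_seconds;   // CPU consumed by the calling thread only
  double process_seconds;  // CPU consumed by every thread in the process
};

// Burns roughly `busy_seconds` of CPU on a helper thread while the calling
// thread blocks in join(), mimicking a benchmark whose real work runs on
// internal threads.
CpuSample MeasureWorkOnHelperThread(double busy_seconds) {
  assert(busy_seconds >= 0.0);
  const double thread_start = CpuSeconds(CLOCK_THREAD_CPUTIME_ID);
  const double process_start = CpuSeconds(CLOCK_PROCESS_CPUTIME_ID);

  std::thread worker([busy_seconds] {
    const double start = CpuSeconds(CLOCK_THREAD_CPUTIME_ID);
    volatile unsigned long sink = 0;
    // Spin until this worker thread has used the requested CPU time.
    while (CpuSeconds(CLOCK_THREAD_CPUTIME_ID) - start < busy_seconds) {
      for (int i = 0; i < 1000; ++i) sink = sink + i;
    }
  });
  worker.join();

  return {CpuSeconds(CLOCK_THREAD_CPUTIME_ID) - thread_start,
          CpuSeconds(CLOCK_PROCESS_CPUTIME_ID) - process_start};
}
```

Calling `MeasureWorkOnHelperThread(0.05)` should report a process-wide figure of about 0.05 s and a much smaller calling-thread figure, since the caller spends its time blocked in `join()`. The `/process_time/real_time` suffixes in the benchmark names after this change suggest the benchmarks now use google/benchmark's `MeasureProcessCPUTime()` and `UseRealTime()` options, though the one-line diff itself is not shown here.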
Test command for master: `bazel test --config=opt --test_output=streamed //test/cpp/microbenchmarks:bm_callback_streaming_ping_pong --test_arg='--benchmark_min_time=0.05s' --test_arg='--benchmark_min_warmup_time=0.01' --test_arg='--benchmark_filter=.*<MinIn.*/*/1$'`
For this PR, change the filter to `--test_arg='--benchmark_filter=.*<MinIn.*/*/1/proc.*'`
Before this change (scroll right to see `inf` bytes/s, and note that most rows ran 10k iterations even though the requested minimum benchmark time had already been reached):
```
---------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------------------------
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/0/1 38932 ns 1052 ns 10000 bytes_per_second=0/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/1/1 39002 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/8/1 39457 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/64/1 51693 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/512/1 41174 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/4096/1 45006 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/32768/1 70990 ns 0.000 ns 10000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/262144/1 302831 ns 0.000 ns 1000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/2097152/1 2202385 ns 0.000 ns 1000 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/16777216/1 30137348 ns 0.000 ns 10 bytes_per_second=inf/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/134217728/1 437821865 ns 0.000 ns 1 bytes_per_second=inf/s
...
//test/cpp/microbenchmarks:bm_callback_streaming_ping_pong PASSED in 25.7s
```
After this change:
```
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
--------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/0/1/process_time/real_time 38233 ns 75471 ns 1768 bytes_per_second=0/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/1/1/process_time/real_time 38847 ns 76616 ns 1781 bytes_per_second=50.2778Ki/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/8/1/process_time/real_time 38848 ns 76929 ns 1758 bytes_per_second=402.21Ki/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/64/1/process_time/real_time 49873 ns 95929 ns 1685 bytes_per_second=2.44761Mi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/512/1/process_time/real_time 41606 ns 80703 ns 1431 bytes_per_second=23.4718Mi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/4096/1/process_time/real_time 45495 ns 86097 ns 1335 bytes_per_second=171.722Mi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/32768/1/process_time/real_time 71506 ns 117093 ns 806 bytes_per_second=874.048Mi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/262144/1/process_time/real_time 353012 ns 433389 ns 183 bytes_per_second=1.38319Gi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/2097152/1/process_time/real_time 3059811 ns 3215404 ns 18 bytes_per_second=1.27663Gi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/16777216/1/process_time/real_time 34951210 ns 35279138 ns 2 bytes_per_second=915.562Mi/s
BM_CallbackBidiStreaming<MinInProcess, NoOpMutator, NoOpMutator>/134217728/1/process_time/real_time 443034887 ns 444528037 ns 1 bytes_per_second=577.833Mi/s
...
//test/cpp/microbenchmarks:bm_callback_streaming_ping_pong PASSED in 5.1s
```
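The `bytes_per_second` values in both tables can be checked with simple arithmetic. The sketch below is an assumption-laden reconstruction, not code from the benchmark: it treats the rate as bytes divided by elapsed seconds, and it assumes each iteration moves the message once in each direction (a factor of two inferred from the printed numbers, not stated in the source). `BytesPerSecond` and `BytesPerIteration` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// The inf behaviour below relies on IEEE-754 floating-point division.
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE-754 doubles");

// Rate as a plain division. With IEEE-754 doubles, a positive byte count
// divided by 0.0 seconds yields +inf, and 0 bytes over a positive time is 0.
double BytesPerSecond(double bytes_processed, double elapsed_seconds) {
  return bytes_processed / elapsed_seconds;
}

// Assumption: one ping-pong iteration sends `message_size` bytes each way.
double BytesPerIteration(std::int64_t message_size) {
  return 2.0 * static_cast<double>(message_size);
}
```

Under those assumptions, the `/1/1` row after the change works out to `BytesPerSecond(BytesPerIteration(1), 38847e-9) / 1024`, roughly 50.28 Ki/s, which agrees with the reported 50.2778Ki/s when the real-time column is used as the divisor. Before the change, a measured CPU time of 0.000 ns as the divisor is consistent with the `inf/s` entries.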
Closes #37936
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37936 from drfloob:bm-callback-measures-all-thread-cpu d5ba8420bd
PiperOrigin-RevId: 686640919
pull/37594/merge
parent bd4792dab6
commit 0b13caed53
1 changed file with 1 addition and 0 deletions