interop-testing: update the Interop-test-descriptions doc to reflect the soak concurrency (#38126)

-  Update the Interop-test-descriptions doc to reflect the concurrency improvement in the rpc_soak and channel_soak tests.

- PTAL @apolcyn

Closes #38126

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/38126 from zbilun:soak-doc-update e8ae5ae9a6
PiperOrigin-RevId: 699251174
pull/37751/merge
zbilun 6 days ago committed by Copybara-Service
parent c6dccd4cb1
commit 7826ddca68
  1. 39
      doc/interop-test-descriptions.md

@ -1005,14 +1005,21 @@ Client asserts:
### rpc_soak
The client performs many large_unary RPCs in sequence over the same channel.
The client records the latency and status of each RPC in some data structure.
If the test ever consumes `soak_overall_timeout_seconds` seconds and still hasn't
completed `soak_iterations` RPCs, then the test should discontinue sending RPCs
as soon as possible. After performing all RPCs, the test should examine
previously recorded RPC latency and status results in a second pass and fail if
either:
The total number of RPCs to execute is controlled by the `soak_iterations`
parameter, which defaults to 10. The number of threads used to execute RPCs
is controlled by `soak_num_threads`. By default, `soak_num_threads` is set to 1.
a) not all `soak_iterations` RPCs were completed
The client records the latency and status of each RPC in
thread-specific data structure, which are later aggregated to form the overall
results. If the test ever consumes `soak_overall_timeout_seconds` seconds
and still hasn't completed `soak_iterations` RPCs, then the test should
discontinue sending RPCs as soon as possible. Each thread should independently
track its progress and stop once the overall timeout is reached.
After performing all RPCs, the test should examine the previously aggregated RPC
latency and status results from all threads in a second pass and fail if either:
a) not all `soak_iterations` RPCs were completed across all threads
b) the sum of RPCs that either completed with a non-OK status or exceeded
`max_acceptable_per_rpc_latency_ms` exceeds `soak_max_failures`
@ -1029,10 +1036,15 @@ results of each iteration (i.e. RPC) in a format the matches the following
regexes:
- Upon success:
- `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ succeeded`
- `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri:
\S+ succeeded`
- Upon failure:
- `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ failed:`
- `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri:
\S+ failed`
- Thread-specific logs will include the thread_id, helping to track performance
across threads.
This test must be configurable via a few different command line flags:
@ -1057,6 +1069,14 @@ This test must be configurable via a few different command line flags:
* `soak_min_time_ms_between_rpcs`: The minimum time in milliseconds between
consecutive RPCs. Useful for limiting QPS.
* `soak_num_threads`: Specifies the number of threads to use for concurrently
executing the soak test. Each thread performs `soak_iterations / soak_num_threads`
RPCs.
This value defaults to 1 (i.e., no concurrency) but can be
increased for concurrent execution. The total soak_iterations must be
divisible by soak_num_threads.
The following is optional but encouraged to improve debuggability:
* Implementations should log the number of milliseconds that each RPC takes.
@ -1078,7 +1098,6 @@ included in that latency measurement (channel teardown semantics differ widely
between languages). This latency measurement should also be the value that is
logged and recorded in the latency histogram.
### orca_per_rpc
[orca_per_rpc]: #orca_per_rpc

Loading…
Cancel
Save