diff --git a/doc/interop-test-descriptions.md b/doc/interop-test-descriptions.md index 2ae7ffb6b10..1026f019de8 100755 --- a/doc/interop-test-descriptions.md +++ b/doc/interop-test-descriptions.md @@ -1005,14 +1005,21 @@ Client asserts: ### rpc_soak The client performs many large_unary RPCs in sequence over the same channel. -The client records the latency and status of each RPC in some data structure. -If the test ever consumes `soak_overall_timeout_seconds` seconds and still hasn't -completed `soak_iterations` RPCs, then the test should discontinue sending RPCs -as soon as possible. After performing all RPCs, the test should examine -previously recorded RPC latency and status results in a second pass and fail if -either: +The total number of RPCs to execute is controlled by the `soak_iterations` +parameter, which defaults to 10. The number of threads used to execute RPCs +is controlled by `soak_num_threads`. By default, `soak_num_threads` is set to 1. -a) not all `soak_iterations` RPCs were completed +The client records the latency and status of each RPC in +thread-specific data structure, which are later aggregated to form the overall +results. If the test ever consumes `soak_overall_timeout_seconds` seconds +and still hasn't completed `soak_iterations` RPCs, then the test should +discontinue sending RPCs as soon as possible. Each thread should independently +track its progress and stop once the overall timeout is reached. + +After performing all RPCs, the test should examine the previously aggregated RPC +latency and status results from all threads in a second pass and fail if either: + +a) not all `soak_iterations` RPCs were completed across all threads b) the sum of RPCs that either completed with a non-OK status or exceeded `max_acceptable_per_rpc_latency_ms` exceeds `soak_max_failures` @@ -1029,10 +1036,15 @@ results of each iteration (i.e. RPC) in a format the matches the following regexes: - Upon success: - - `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ succeeded` + - `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri: + \S+ succeeded` - Upon failure: - - `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ failed:` + - `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri: + \S+ failed` + +- Thread-specific logs will include the thread_id, helping to track performance + across threads. This test must be configurable via a few different command line flags: @@ -1057,6 +1069,14 @@ This test must be configurable via a few different command line flags: * `soak_min_time_ms_between_rpcs`: The minimum time in milliseconds between consecutive RPCs. Useful for limiting QPS. +* `soak_num_threads`: Specifies the number of threads to use for concurrently + executing the soak test. Each thread performs `soak_iterations / soak_num_threads` + RPCs. + +This value defaults to 1 (i.e., no concurrency) but can be + increased for concurrent execution. The total soak_iterations must be + divisible by soak_num_threads. + The following is optional but encouraged to improve debuggability: * Implementations should log the number of milliseconds that each RPC takes. @@ -1078,7 +1098,6 @@ included in that latency measurement (channel teardown semantics differ widely between languages). This latency measurement should also be the value that is logged and recorded in the latency histogram. - ### orca_per_rpc [orca_per_rpc]: #orca_per_rpc