interop-testing: update the Interop-test-descriptions doc to reflect the soak concurrency (#38126)

- Update the Interop-test-descriptions doc to reflect the concurrency improvement in the rpc_soak and channel_soak tests.
- PTAL @apolcyn

Closes #38126

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/38126 from zbilun:soak-doc-update e8ae5ae9a6
PiperOrigin-RevId: 699251174
6 days ago · 7826ddca68
parent c6dccd4cb1
commit 7826ddca68
1 changed file with 29 additions and 10 deletions
--- a/doc/interop-test-descriptions.md
+++ b/doc/interop-test-descriptions.md
@@ -1005,14 +1005,21 @@ Client asserts:
 ### rpc_soak

 The client performs many large_unary RPCs in sequence over the same channel.
-The client records the latency and status of each RPC in some data structure.
-If the test ever consumes `soak_overall_timeout_seconds` seconds and still hasn't
-completed `soak_iterations` RPCs, then the test should discontinue sending RPCs
-as soon as possible. After performing all RPCs, the test should examine
-previously recorded RPC latency and status results in a second pass and fail if
-either:
+The total number of RPCs to execute is controlled by the `soak_iterations` 
+parameter, which defaults to 10. The number of threads used to execute RPCs 
+is controlled by `soak_num_threads`. By default, `soak_num_threads` is set to 1. 

-a) not all `soak_iterations` RPCs were completed
+The client records the latency and status of each RPC in
+thread-specific data structures, which are later aggregated to form the overall
+results. If the test ever consumes `soak_overall_timeout_seconds` seconds 
+and still hasn't completed `soak_iterations` RPCs, then the test should 
+discontinue sending RPCs as soon as possible. Each thread should independently 
+track its progress and stop once the overall timeout is reached.
+
+After performing all RPCs, the test should examine the previously aggregated RPC
+latency and status results from all threads in a second pass and fail if either:
+
+a) not all `soak_iterations` RPCs were completed across all threads

 b) the sum of RPCs that either completed with a non-OK status or exceeded
   `max_acceptable_per_rpc_latency_ms` exceeds `soak_max_failures`
@@ -1029,10 +1036,15 @@ results of each iteration (i.e. RPC) in a format the matches the following
 regexes:

 - Upon success:
-  - `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ succeeded`
+  - `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri: 
+  \S+ succeeded`

 - Upon failure:
-  - `soak iteration: \d+ elapsed_ms: \d+ peer: \S+ failed:`
+  - `thread_id: \d+ soak iteration: \d+ elapsed_ms: \d+ peer: \S+ server_uri: 
+  \S+ failed`
+
+- Each log line includes the `thread_id`, which helps to track performance
+  across threads.

 This test must be configurable via a few different command line flags:

@@ -1057,6 +1069,14 @@ This test must be configurable via a few different command line flags:
 * `soak_min_time_ms_between_rpcs`: The minimum time in milliseconds between
  consecutive RPCs. Useful for limiting QPS.

+* `soak_num_threads`: Specifies the number of threads to use for concurrently 
+  executing the soak test. Each thread performs `soak_iterations / soak_num_threads`
+  RPCs.
+
+  This value defaults to 1 (i.e., no concurrency) but can be
+  increased for concurrent execution. The total `soak_iterations` must be
+  divisible by `soak_num_threads`.
+
 The following is optional but encouraged to improve debuggability:

 * Implementations should log the number of milliseconds that each RPC takes.
@@ -1078,7 +1098,6 @@ included in that latency measurement (channel teardown semantics differ widely
 between languages). This latency measurement should also be the value that is
 logged and recorded in the latency histogram.

-
 ### orca_per_rpc
 [orca_per_rpc]: #orca_per_rpc
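
The threading behavior this change documents for `rpc_soak` can be sketched roughly as follows. This is a minimal illustration in Python, not code from any gRPC interop client: `run_soak`, `do_rpc`, and the placeholder `peer` and `server_uri` values are hypothetical, the default latency and timeout values are arbitrary, and logging an iteration as `failed` when it exceeds the latency limit is an assumption. Only the flag names, the divisibility requirement, the per-thread iteration count, the two failure conditions, and the log line shape come from the doc text in the diff.

```python
import threading
import time


def run_soak(do_rpc, soak_iterations=10, soak_num_threads=1,
             soak_max_failures=0, max_acceptable_per_rpc_latency_ms=1000,
             soak_overall_timeout_seconds=10,
             peer="localhost:50051", server_uri="dns:///localhost:50051",
             log=print):
    """Hypothetical soak driver; do_rpc() returns True for an OK status."""
    if soak_iterations % soak_num_threads != 0:
        raise ValueError("soak_iterations must be divisible by soak_num_threads")
    per_thread = soak_iterations // soak_num_threads
    deadline = time.monotonic() + soak_overall_timeout_seconds
    # One record per thread; each worker only writes to its own record.
    results = [{"completed": 0, "failures": 0, "latencies_ms": []}
               for _ in range(soak_num_threads)]

    def worker(thread_id):
        mine = results[thread_id]
        for i in range(per_thread):
            # Each thread checks the shared overall deadline on its own.
            if time.monotonic() >= deadline:
                break
            start = time.monotonic()
            ok = do_rpc()
            elapsed_ms = int((time.monotonic() - start) * 1000)
            failed = (not ok) or elapsed_ms > max_acceptable_per_rpc_latency_ms
            mine["completed"] += 1
            mine["failures"] += int(failed)
            mine["latencies_ms"].append(elapsed_ms)
            outcome = "failed" if failed else "succeeded"
            log(f"thread_id: {thread_id} soak iteration: {i} "
                f"elapsed_ms: {elapsed_ms} peer: {peer} "
                f"server_uri: {server_uri} {outcome}")

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(soak_num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Second pass: aggregate the per-thread records and apply both checks.
    completed = sum(r["completed"] for r in results)
    failures = sum(r["failures"] for r in results)
    passed = completed == soak_iterations and failures <= soak_max_failures
    return passed, completed, failures


if __name__ == "__main__":
    # Stand-in RPC that always succeeds; a real client would issue large_unary.
    print(run_soak(lambda: True, soak_iterations=8, soak_num_threads=4))
```

Keeping one result record per thread avoids locking on the hot path; the aggregation happens only after every thread has joined, which matches the "second pass" wording in the doc.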