This is a no-op, just reordering `requirements.lock`.
By providing `-r requirements.txt` to `pip freeze`, it's able to separate
the dependencies required via `requirements.txt` from the
sub-dependencies installed to satisfy them.
- Switched from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies
Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what's described
here: https://github.com/psf/black/issues/1837. While this feature is
currently experimental, it will be enabled by default:
https://github.com/psf/black/issues/2188. After running black with the
new string processing so that these `"hello" " world"` string
concatenations were merged, I removed `--experimental-string-processing`
for stability and regenerated the code again.
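For illustration, here's a minimal made-up example (not from the actual codebase) of the implicit string concatenation in question, and the merged form that `--experimental-string-processing` produces:
```python
# Python joins adjacent string literals at parse time, so both forms produce
# the same value; the first just reads like an accidental formatting artifact.
greeting = "hello" " world"      # implicit concatenation black could emit
greeting_merged = "hello world"  # merged form after --experimental-string-processing

assert greeting == greeting_merged
```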
To the reviewer: don't even try to open the "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore the `run black and isort` commit.
Improvements to the `LoadBalancerAccumulatedStatsRequest` output to make
it readable.
This greatly affects `assertRpcStatusCodes()` output, used in authz and
custom_lb.
No more before and after stats, just the useful diff stats from now on.
Minimal and readable.
The diff stats also include `rpcs_started` now.
![image](https://github.com/grpc/grpc/assets/672669/a4e38d82-be5a-4f31-9d88-da2bf9712d9b)
Output example:
```
--- Starting subTest __main__.AuthzTest.test_plaintext_allow.01_host_wildcard ---
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC XdsUpdateClientConfigureService.Configure(request=ClientConfigureRequest({'types': ['EMPTY_CALL'], 'metadata': [{'key': 'test', 'value': 'host-wildcard'}]}), timeout=5, wait_for_ready=True)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7] << Received accumulated stats difference. Expecting RPCs with status (0, OK) for method EMPTY_CALL.
- method: EMPTY_CALL
rpcs_started: 13
result:
(0, OK): 14
--- Finished subTest __main__.AuthzTest.test_plaintext_allow.01_host_wildcard ---
```
In case of test failure, it'll still print all stats at the end,
including before and after:
```
AssertionError: Expected only status (15, DATA_LOSS), but found status (0, OK) for method UNARY_CALL.
Stats before:
- method: UNARY_CALL
rpcs_started: 2153
result:
(14, UNAVAILABLE): 1674
(0, OK): 479
Stats after:
- method: UNARY_CALL
rpcs_started: 2404
result:
(0, OK): 730
(14, UNAVAILABLE): 1674
Diff stats:
- method: UNARY_CALL
rpcs_started: 251
result:
(0, OK): 251
```
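Conceptually, the diff is just a per-method, per-status subtraction of the "before" accumulated stats from the "after" ones. A minimal sketch of that idea (hypothetical helper and data layout, not the framework's actual code):
```python
from collections import Counter


def diff_accumulated_stats(before: dict, after: dict) -> dict:
    """Hypothetical helper: per-method difference of accumulated stats.

    `before`/`after` map method name -> {"rpcs_started": int,
    "result": {status: count}}, mirroring the printed output above.
    """
    diff = {}
    for method, after_stats in after.items():
        before_stats = before.get(method, {"rpcs_started": 0, "result": {}})
        result_diff = Counter(after_stats["result"])
        result_diff.subtract(before_stats["result"])
        diff[method] = {
            "rpcs_started": after_stats["rpcs_started"] - before_stats["rpcs_started"],
            # Keep only positive deltas so the diff stays minimal and readable.
            "result": {status: n for status, n in result_diff.items() if n > 0},
        }
    return diff
```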
And while I was at it, I also made the `LoadBalancerStatsResponse` output nice:
![image](https://github.com/grpc/grpc/assets/672669/b15908a7-bae4-41a0-a2f7-c903e398432a)
- Fix broken `bin/run_channelz.py` helper
- Create `bin/run_ping_pong.py` helper that runs the baseline (aka
"ping_pong") test against preconfigured infra
- Setup automatic port forwarding when running `bin/run_channelz.py` and
`bin/run_ping_pong.py`
- Create `bin/cleanup_cluster.sh` helper to wipe out xDS resources based
on the namespaces present on the cluster
Note: this involves a small change to the non-helper code, but it's just
moving the part that creates an XdsTestServer/XdsTestClient instance for
a given pod.
Our test clusters are in a broken/dirty state (except the URL map's
"basic" cluster, which isn't affected).
This PR switches to the newly created clusters to:
1. Get a data point on whether newly created clusters are affected
by the same issues.
2. Allow for descriptive work on the old clusters.
3. Hopefully, bring our tests back to green.
Bonus: more sensible cluster names.
- Added support for pod log collection. To enable, set the `--collect_app_logs` flag and specify `--log_dir`.
- Added support and helpers for operating on the `--log_dir` (natively provided by absl)
- Added support for `--follow` to `bin/run_test_server.py` and `bin/run_test_client.py` to follow pod logs printed to stdout
- Moved `PortForwarder` from k8s.py to its own file
The collection itself will be enabled per-suite in https://github.com/grpc/grpc/pull/30735.
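For reference, a minimal sketch of what following a pod's logs to stdout boils down to with the `kubernetes` Python client (illustrative only; the actual helpers handle log collection and `--log_dir` on top of this):
```python
import sys

from kubernetes import client, config


def follow_pod_logs(pod_name: str, namespace: str) -> None:
    """Stream a pod's logs to stdout until the log stream is closed."""
    config.load_kube_config()  # assumes a local kubeconfig with cluster access
    core_v1 = client.CoreV1Api()
    # follow=True keeps the connection open; _preload_content=False returns the
    # raw urllib3 response so the log can be consumed incrementally.
    response = core_v1.read_namespaced_pod_log(
        name=pod_name,
        namespace=namespace,
        follow=True,
        _preload_content=False,
    )
    for chunk in response.stream():
        sys.stdout.write(chunk.decode("utf-8", errors="replace"))
```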
- Changes the order of waiting for pods to start: wait for the pods first, then for the deployment to transition to active. This should provide more useful information in the logs, showing exactly why the pod didn't start, instead of the generic "Replicas not available" (ref b/200293121). This is also needed for https://github.com/grpc/grpc/pull/30594
- Add support for a `check_result` callback in the retryer helpers (see the sketch below)
- Completely replaces `retrying` with `tenacity`, ref b/200293121. `retrying` is no longer maintained.
- Improves the readability of timeout errors: they now state which timeout (or attempt limit) was exceeded, and why the retried operation kept failing (the exception or the check function):
Before:
> `tenacity.RetryError: RetryError[<Future at 0x7f8ce156bc18 state=finished returned dict>]`
After:
> `framework.helpers.retryers.RetryError: Retry error calling framework.infrastructure.k8s.KubernetesNamespace.get_pod: timeout 0:01:00 exceeded. Check result callback returned False.`
- Improves the readability of the k8s wait operation errors: the log now includes a colorized and formatted status of the k8s object being watched, instead of a dump of the full k8s object. For example, here's an error caused by using an incorrect TD bootstrap image:
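To show the `check_result` semantics, here's a rough sketch expressed directly with `tenacity` primitives (hypothetical names; the framework wraps this in its own `framework.helpers.retryers` API):
```python
from tenacity import Retrying, retry_if_result, stop_after_delay, wait_fixed


def pod_is_running(pod) -> bool:
    """check_result-style callback: True means the result is acceptable."""
    return pod is not None and pod.status.phase == "Running"


# Retry the wrapped call while the check fails, waiting 1s between attempts,
# and give up (raising a retry error) once 60 seconds have elapsed.
retryer = Retrying(
    retry=retry_if_result(lambda pod: not pod_is_running(pod)),
    wait=wait_fixed(1),
    stop=stop_after_delay(60),
)

# Usage (hypothetical): pod = retryer(k8s_namespace.get_pod, pod_name)
```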
pod_name shouldn't be a part of the test app; it's purely a k8s idiom.
Originally server_id was intended for this purpose, but it was missed
when support for multiple server replicas was added.
This replaces pod_name and server_id with hostname and improves
replica-specific log messages, so it's clear which server the
RPCs are issued to.
In addition, all RPC logs are now annotated with the hostname:port,
so the destination is clear.
Before:
```
server_app.py:76] Setting health status to serving
grpc.py:60] RPC XdsUpdateHealthService.SetServing(request=Empty({}), timeout=90, wait_for_ready=True)
grpc.py:60] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True)
server_app.py:78] Server reports status: SERVING
```
After:
```
server_app.py:89] [psm-grpc-server-69bcf749c5-bg4x5] Setting health status to NOT_SERVING
grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC XdsUpdateHealthService.SetNotServing(request=Empty({}), timeout=90, wait_for_ready=True)
grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True)
server_app.py:92] [psm-grpc-server-69bcf749c5-bg4x5] Health status status: NOT_SERVING
```
Similarly, this adds hostname to the client app, mainly for logging.
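The annotation itself is simple; a tiny illustration of the idea (hypothetical helper, not the framework's code):
```python
import logging

logger = logging.getLogger(__name__)


def log_rpc(hostname: str, port: int, rpc: str, **kwargs) -> None:
    # Prefix every RPC log line with the destination replica's hostname:port,
    # so it's clear which server the RPC is issued to.
    logger.info("[%s:%s] >> RPC %s(%s)", hostname, port, rpc, kwargs)
```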
Separates the xDS Test Client/Server (which represent an interface to the corresponding workload running remotely) from their runners (the Kubernetes-specific logic that provisions the workloads with their prerequisites).
This is a refactoring; it should not change the behavior.
The `kubernetes` library does not provide a way to configure the default socket timeout used by `urllib3`, which it relies on under the hood. And `urllib3`'s default socket timeout is infinity.
This PR sets the default socket timeout using python's `socket.setdefaulttimeout()` to 60 seconds.
This affects `urllib3` directly, and therefore `kubernetes`.
The change is also picked up by `google-api-python-client`, which does not use `urllib3` (it uses `httplib2`), but [respects](https://googleapis.github.io/google-api-python-client/docs/epy/googleapiclient.http-module.html#build_http) `socket.setdefaulttimeout()`.
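A minimal sketch of the effect (illustrative only):
```python
import socket

from kubernetes import client, config

# Set the process-wide default socket timeout before any connections are made.
# urllib3 (used under the hood by the kubernetes client) and httplib2 (used by
# google-api-python-client) both consult it when creating new sockets.
socket.setdefaulttimeout(60)

config.load_kube_config()
core_v1 = client.CoreV1Api()
# Socket operations for this call now time out after 60s instead of hanging forever.
pods = core_v1.list_namespaced_pod(namespace="default")
```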
* [PSM Interop] Extend clean-up script to 2 other GKE clusters
* Use a safer approach to invoke the cleanup script
* Handle the readonly issue
* Make sanity test happy
* Revert "Revert "[App Net] Switch Router to Mesh and Add unique string to Scope (#28145)" (#28176)"
This reverts commit cc968b2158.
* Allow scope to be None
* Add back references and scope field
* Set scope in router
* Reverse order of cleanup
* Add router_scope flag
* Use router_scope flag to create Router
* I apparently don't know how to brain
* Yapf
* Yeah, that can't be the default
* Remove debug print
* Remove impossible todos
* And another
* Switch from router-scope to config-scope
* Implement schema changes
* Use backend service URL
* Use CLH reference format to backend service
* I am an idiot
* *internal screaming*
* Try project number
* Why is this all awful
* Go back to trying project name
* Try cleaning things up
* Agh
* Address review comments
* Remove superfluous Optional type