This PR moves the PSM benchmarks from the benchmark-prod2
cluster to the psm-benchmarks-performance cluster. This avoids
any unexpected changes to the OSS benchmarks caused by the tuning
we may perform for the PSM benchmarks.
The psm-benchmarks-performance cluster has 2 system nodes,
1 driver-ci and 2 worker-ci nodes. The node type is the same as
the benchmark-prod2 cluster.
* Support Python 3.11
* Update build images for 3.11
* Whoopsie
* The architecture of this thing is garbage
* Silence ownership warning
* Account for change in git behavior
* Fix directory
* I am in great pain
* Update Windows and arm linux
* Agh
* Clean up
To capture the return status of the test in run_test, the last command must be the call to the test itself.
This removes `set +x`, which made run_test always return success instead of propagating the test status.
I can't find it, but this exact error bit us before. Looks like it leaked to other scripts.
The good thing is that if the test was executed, its failure would still be picked up from the result XML.
However, if the test framework didn't start in the first place, the result would be a false positive: the job reports success although nothing ran.
Example: https://source.cloud.google.com/results/invocations/98d3e679-ec8a-40bd-9f36-88179747b0d6/targets
```
/home/kbuilder/.pyenv/versions/k8s_xds_test_runner/bin/python3: Error while finding module specification for 'tests.authz_test' (ModuleNotFoundError: No module named 'tests')
+ set +x
Failed test suites: 0
[ID: 3548168] Command finished after 625 secs, exit value: 0
```
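The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual script: `failing_test`, `run_test_broken`, and `run_test_fixed` are hypothetical names, and the exit code 3 is arbitrary. It relies only on the bash rule that a function returns the status of the last command it executed.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a test invocation that fails.
failing_test() { return 3; }

# Broken variant: `set +x` is the last command in the function,
# so the function returns the status of `set +x` (0), not the test's.
run_test_broken() {
  set -x
  failing_test
  set +x
}

# Fixed variant: the test call is the last command, so its status
# propagates. The body runs in a subshell so xtrace does not leak
# into the caller.
run_test_fixed() (
  set -x
  failing_test
)

# Trace output goes to stderr; discard it to keep the demo quiet.
run_test_broken 2>/dev/null; broken_status=$?
run_test_fixed 2>/dev/null; fixed_status=$?

echo "broken=${broken_status} fixed=${fixed_status}"
# prints: broken=0 fixed=3
```

If tracing has to be switched off inside the function, an alternative is to save the status right after the test call (`local status=$?`) and end the function with `return "$status"`.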
- Enables pod log collection, implemented in https://github.com/grpc/grpc/pull/30594, in all PSM interop jobs.
- Associates test suite runs with their own log file, so it's displayed on the "Target Log" tab.
Undoes https://github.com/grpc/grpc/pull/27096.
While we lost the context for why the Python tests were pinned to the C++ server,
we think this was due to the lack of support for the set_not_serving RPC
in the Python server, see https://github.com/grpc/grpc/issues/30635.
This RPC is only used in two tests, and for them we added a
temporary override of the test server to the reference Java server,
see https://github.com/grpc/grpc/pull/30636.
All other LB tests should work with the Python server just fine.
* Add enablePrometheus annotation.
The PR adds the enablePrometheus annotation to load tests that are
part of the PSM data plane performance tests. This annotation enables
all PSM-related tests to obtain data from Prometheus, even for the
regular tests.
This addresses the issue with skips not working in golang tests, ref b/235688697.
1. Unifies `TESTING_VERSION` detection in grpc_xds_k8s_install_test_driver.sh. The new approach is applicable to all languages.
2. Uses `TESTING_VERSION` in all build files, both in `--testing_version` and when tagging Docker images. This will be backported to all active test branches. Build scripts in all other languages will be updated as well.
* Attempt to set correct platform on Mac OS
* Add some debug
* Make it fail
* Print more
* Try again
* Maybe it's an ordering issue?
* Get logs back
* Try copying distutils to see exactly what is being used
* Actually export the variable
* I just love debugging with CI
* One directory higher this time
* Try with an upgraded Python install
* Fix version
* Rebreak
* Try setting it even earlier?
* Unbreak
* Try explicitly renaming the artifacts
* Fix
* I am about ready to start NAT hole punching for SSH
* Break things for logs
* Whoops
* Clean up
* Shellcheck
* unit tests with bazel
* passing via --test_env from bazel command line
* remove env from BUILD; fix sanity check in run_one_test_bazel.sh
* add port server
* [fuzzer] Add a script to sample fuzzers
* remember the script
* add ci
* bleh
* fix
* Update sample_fuzzers.sh
* tweak
* tweak
* tweak
* tweak
* tweak
* fix fuzzer found bug
* add explainer
* make it bold af
* limit max fuzzing time in addition to runs
This commit adds experimental CI for PSM tests.
The tests included initially are Java and cxx tests, with expected load
from 100 to 30000, the same as in the manual test. Proxied and proxyless
tests run on the benchmark-prod2 cluster, on 8-core machines (both client and server).
Data is uploaded to experimental tables.
Open for initial feedback on what we want to run on CI.