This PR moves the PSM benchmark from the benchmark-prod2
cluster to the psm-benchmarks-performance cluster. This avoids
unexpected changes to the OSS benchmarks caused by tuning
we may perform for the PSM benchmarks.
The psm-benchmarks-performance cluster has 2 system nodes,
1 driver-ci and 2 worker-ci nodes. The node type is the same as
the benchmark-prod2 cluster.
To capture the return status of the test in run_test, the last command must be the call to the test itself.
This removes `set +x`, which made run_test always return success instead of propagating the test status.
I can't find it, but this exact error bit us before. Looks like it leaked to other scripts.
The good thing is that if the test was executed, its failure would still be picked up from the result xml.
However, if the test framework didn't start in the first place, the result would be a false positive.
Example: https://source.cloud.google.com/results/invocations/98d3e679-ec8a-40bd-9f36-88179747b0d6/targets
```
/home/kbuilder/.pyenv/versions/k8s_xds_test_runner/bin/python3: Error while finding module specification for 'tests.authz_test' (ModuleNotFoundError: No module named 'tests')
+ set +x
Failed test suites: 0
[ID: 3548168] Command finished after 625 secs, exit value: 0
```
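The masking effect can be reproduced in isolation. The following is a minimal sketch, not the actual run_test from the test scripts: `fake_test`, `run_test_masked` and `run_test_propagating` are hypothetical names, and `fake_test` only stands in for a test runner that fails to start.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a test that fails before the framework starts.
fake_test() {
  return 3
}

# Buggy shape: `set +x` is the last command in the function,
# so the function returns the status of `set +x` (0), not of the test.
run_test_masked() {
  set -x
  fake_test
  set +x
}

# Fixed shape: the test call is the last command,
# so its exit status becomes the function's return status.
run_test_propagating() {
  fake_test
}

# Trace output goes to stderr; silence it for the demonstration.
run_test_masked 2>/dev/null
masked_status=$?

run_test_propagating
propagating_status=$?

echo "masked=${masked_status} propagating=${propagating_status}"
# prints: masked=0 propagating=3
```

A shell function returns the exit status of the last command it ran, which is why the trailing `set +x` hid the failure in the log above.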
- Enables pod log collection, implemented in https://github.com/grpc/grpc/pull/30594, in all PSM interop jobs.
- Associates test suite runs with their own log file, so that it's displayed on the "Target Log" tab.
Undoes https://github.com/grpc/grpc/pull/27096.
While we lost the context for why the py tests were pinned to the cpp server,
we think this was due to the lack of support for the set_not_serving RPC
in the python server, see https://github.com/grpc/grpc/issues/30635.
This RPC is only used in two tests, and for them we added a
temporary override of the test server to the reference Java server,
see https://github.com/grpc/grpc/pull/30636.
All other LB tests should work with the python server just fine.
* Add enablePrometheus annotation.
The PR adds the enablePrometheus annotation to load tests that are
part of PSM data plane performance tests. This annotation enables
all PSM related tests to obtain data from Prometheus, even for the
regular tests.
This addresses the issue with skips not working in golang tests, ref b/235688697.
1. Unifies `TESTING_VERSION` detection in grpc_xds_k8s_install_test_driver.sh - the new approach is applicable to all languages.
2. Uses `TESTING_VERSION` in all build files in `--testing_version` and when tagging docker images. This will be backported to all active test branches. Build scripts in all other languages will be updated as well.
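As a rough illustration of the second point, one value can feed both the image tag and the test flag. This is a sketch under stated assumptions, not the actual build file: `IMAGE_REPO` and the fallback version string are hypothetical, and the real detection logic in grpc_xds_k8s_install_test_driver.sh is not reproduced here.

```shell
#!/usr/bin/env bash
# Assumption: TESTING_VERSION was already detected by the install script.
# The fallback value only keeps this sketch runnable on its own.
TESTING_VERSION="${TESTING_VERSION:-example-version}"

# Hypothetical image repository, for illustration only.
IMAGE_REPO="example.invalid/psm-interop/example-client"

# The same value is used for the docker image tag and for the test flag,
# so the image under test and the version reported to the tests cannot drift.
image_ref="${IMAGE_REPO}:${TESTING_VERSION}"
version_flag="--testing_version=${TESTING_VERSION}"

echo "image: ${image_ref}"
echo "flag:  ${version_flag}"
```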
* [fuzzer] Add a script to sample fuzzers
* remember the script
* add ci
* bleh
* fix
* Update sample_fuzzers.sh
* tweak
* tweak
* tweak
* tweak
* tweak
* fix fuzzer found bug
* add explainer
* make it bold af
* limit max fuzzing time in addition to runs
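The last item can be sketched as follows, assuming the fuzzers are libFuzzer binaries, where `-runs` and `-max_total_time` are standard options. The helper `build_fuzz_cmd`, the binary path and the limits are hypothetical and are not taken from sample_fuzzers.sh.

```shell
#!/usr/bin/env bash
# Build a bounded libFuzzer command line.
# Arguments (all hypothetical values supplied by the caller):
#   $1 - path to the fuzzer binary
#   $2 - maximum number of runs        (libFuzzer -runs)
#   $3 - maximum wall time in seconds  (libFuzzer -max_total_time)
build_fuzz_cmd() {
  printf '%s -runs=%s -max_total_time=%s\n' "$1" "$2" "$3"
}

# The command is only printed here, not executed.
fuzz_cmd="$(build_fuzz_cmd ./example_fuzzer 20000 60)"
echo "$fuzz_cmd"
# prints: ./example_fuzzer -runs=20000 -max_total_time=60
```

Bounding both values means a fuzzer with slow iterations stops on the time limit, while a fast one stops on the run count.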
This commit adds experimental CI for PSM tests.
The tests included initially are the Java and cxx tests, with expected load
from 100 to 30000, the same as the manual test. Proxied and proxyless
tests run on the benchmark-prod2 cluster, on 8-core machines (both client and server).
Data is uploaded to experimental tables.
Open for initial feedback on what we want to run on CI.
* add release version of grpc_distribtests_* jobs .cfg files
* make grpc_distribtests_ruby more aligned with other single-job distribtests
* don't hide packages from build_artifacts step for python and php
* unify DOCKER_TTY_ARGS in docker scripts
* improvements and cleanup in build_and_run_docker.sh
* fix shellcheck
* make sure python sdist artifact is readable
* cleanup bazel_rbe .cfg and .sh files
* upload sponge_log.xml artifacts for selected bazel jobs
* use move_src_tree_and_respawn_itself_rc for bazel RBE tests on linux
* fix wrong config
* Initial GCF distribtest
* Tentatively hook up to CI
* Try again
* Allow dev0 artifacts
* Fix invocation path
* Update gcloud
* Add a 3.8 artifact for presubmits
* And 3.9 too
* Put test files back to normal
* Formatting/linting
* Copyright
* That copyright script doesn't work with shebangs
* Review comments
* Try to create latest-manylinux label
* Accidentally a letter
* Add Python 3.7 manylinux 2014 to presubmit
* Revert CI config file used for test
* Review comments
* Yapf
* Re-add presubmit wheel
* Review comments
* GKE benchmarks: add support for benchmarking grpc-dotnet
* add dotnet to loadtest basic templates
* print full path for generated examples
* add grpc-dotnet scenario to loadtest_example.sh generator
* add grpc-dotnet to the experimental kokoro job
* yapf format code
* refactor RBE configs
* better naming for linux specific RBE configs
* update names of RBE configs elsewhere
* move partial configs to tools/remote_build/include