* [cleanup] Remove profiling timers
- nobody has used this system in years
- if we needed it, we'd probably rewrite it at this point to be something more modern
- let's remove it until that need arises
* fix
* fixes
* Add a provision to allow specification of separate set of channel args for the grpclb channel
* fix asan issue
* review comments
* review comments
* add missing file
* remove unused hdr
* fix sanity
* fix comments
* remove unused hdr
To capture the return status of the test in run_test the last command must be the call to the test itself.
This removes `set +x`, which makes the run_test always return success, and not propagate the test status.
I can't find it, but this exact error bit us before. Looks like it leaked to other scripts.
The good thing is if the test was executed, it's failure would still be picked up from the result xml.
However, if the test framework didn't start in the first place, the result will be false positive.
Example: https://source.cloud.google.com/results/invocations/98d3e679-ec8a-40bd-9f36-88179747b0d6/targets
```
/home/kbuilder/.pyenv/versions/k8s_xds_test_runner/bin/python3: Error while finding module specification for 'tests.authz_test' (ModuleNotFoundError: No module named 'tests')
+ set +x
Failed test suites: 0
[ID: 3548168] Command finished after 625 secs, exit value: 0
```
* ConnectionAttemptInjector: fix tsan failures
* Add hold to test
* Add hold to test
* Address comments
* Address comments
* Address comments
* Fix typo
Co-authored-by: Mark D. Roth <roth@google.com>
When we use retryers with `log_level=logging.INFO`, tenacity logs the result value (or an exception) after each unsuccessful retry attempt.
We often retry methods that return objects, resulting in unreadable log messages:
```
I0820 03:16:29.027635 140613877811008 before_sleep.py:45] Retrying framework.xds_k8s_testcase.IsolatedXdsKubernetesTestCase.cleanup in 10.0 seconds as it raised RetryError: RetryError[Attempts: 21, Value: {'api_version': 'v1',
'kind': 'Namespace',
'metadata': {'annotations': None,
'cluster_name': None,
'creation_timestamp': datetime.datetime(2022, 8, 20, 2, 55, 32, tzinfo=tzlocal()),
'deletion_grace_period_seconds': None,
'deletion_timestamp': datetime.datetime(2022, 8, 20, 3, 6, 27, tzinfo=tzlocal()),
'finalizers': None,
'generate_name': None,
'generation': None,
'labels': {'kubernetes.io/metadata.name': 'psm-interop-server-20220820-0253-yrmam',
'name': 'psm-interop-server-20220820-0253-yrmam',
'owner': 'xds-k8s-interop-test'},
'managed_fields': [{'api_version': 'v1',
'fields_type': 'FieldsV1',
'fields_v1': {'f:metadata': {'f:labels': {'.': {},
'f:kubernetes.io/metadata.name': {},
... (82 more lines)
```
This PR introduces custom `before_sleep` logger, that only logs the value if it's a primitive: `int, str, bool`.
Otherwise, it logs the type, example:
```
k8s_base_runner.py:311] Waiting for pod psm-grpc-client-5d5648478f-7vsf7 to start
retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>.
retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>.
```
Note that this only changes the behavior of the unsuccessful retries, and doesn't affect the new feature that prints formatted k8s status field on if the *final* retry attempt failed.
- Enables pod log collection in all PSM interop jobs implemented in https://github.com/grpc/grpc/pull/30594.
- Associate test suite runs with their own log file, so it's displayed on "Target Log" tab
- Added support for pod log collection. To enable, set `--collect_app_logs` flag, and specify `--log_dir`.
- Added support and helpers for operating on the `--log_dir` (natively provided by absl)
- Added support for `--follow` to `bin/run_test_server.py` and `bin/run_test_client.py` to follow pod logs printed to stdout
- Moved `PortForwarder` from k8s.py to its own file
The collection itself will be enabled per-suite in https://github.com/grpc/grpc/pull/30735.
* xDS interop: Fix default resource prefix
No longer just security tests.
This is done to avoid confusion when debugging resources managed
by the LB tests.
* s/xds/psm