b/228743575 started happening more frequently and there are more important
fires to worry about. Silence this off-by-one flake so we can come
back to it when we have a bit more time.
All alternative server runners except the one in the failover test reuse the primary server runners' namespace. The failover test uses the secondary cluster and manages its own namespace there. `reuse_namespace` disables namespace cleanup, and in this case it was incorrectly set to `True`.
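A minimal sketch of the distinction, assuming a hypothetical `Runner` class (illustrative only, not the driver's actual runner API): a runner that reuses another runner's namespace must leave it alone on cleanup, while a runner that owns its namespace should delete it.

```python
class Runner:
    """Hypothetical stand-in for a k8s workload runner (illustration only)."""

    def __init__(self, namespace, reuse_namespace=False):
        self.namespace = namespace
        self.reuse_namespace = reuse_namespace
        # Records which namespaces this runner deleted, for demonstration.
        self.deleted_namespaces = []

    def cleanup(self):
        # The namespace is deleted only when this runner owns it.
        if not self.reuse_namespace:
            self.deleted_namespaces.append(self.namespace)


# An alternative runner sharing the primary namespace leaves it in place.
alternate = Runner("primary-ns", reuse_namespace=True)
alternate.cleanup()

# The failover runner owns its namespace on the secondary cluster,
# so it must clean it up itself.
failover = Runner("secondary-ns", reuse_namespace=False)
failover.cleanup()
```

With `reuse_namespace=True` set by mistake on the failover runner, the second `cleanup()` would skip the deletion and leak the namespace.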
* Enable outlier detection k8s interop test for Java. (#30641)
* xDS interop: enable outlier detection Java tests in >= 1.49.x
Co-authored-by: Terry Wilson <terrymwilson@gmail.com>
pod_name shouldn't be a part of the test app; it's purely a k8s idiom.
Originally server_id was intended for this purpose, but it was missed
when support for multiple server replicas was added.
This replaces pod_name and server_id with hostname and improves
replica-specific log messages, so it's clear which server
the RPCs are issued to.
In addition, now all RPC logs are annotated with the hostname:port,
so the destination is clear.
Before:
```
server_app.py:76] Setting health status to serving
grpc.py:60] RPC XdsUpdateHealthService.SetServing(request=Empty({}), timeout=90, wait_for_ready=True)
grpc.py:60] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True)
server_app.py:78] Server reports status: SERVING
```
After:
```
server_app.py:89] [psm-grpc-server-69bcf749c5-bg4x5] Setting health status to NOT_SERVING
grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC XdsUpdateHealthService.SetNotServing(request=Empty({}), timeout=90, wait_for_ready=True)
grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True)
server_app.py:92] [psm-grpc-server-69bcf749c5-bg4x5] Health status status: NOT_SERVING
```
Similarly, this adds hostname to the client app, mainly for logging.
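A rough sketch of the log annotation shown in the "After" sample. The helper name `format_rpc_log` and its signature are made up for illustration and are not the framework's actual code:

```python
def format_rpc_log(hostname, port, service, method, **call_args):
    """Builds an RPC log line prefixed with the destination hostname:port.

    Hypothetical helper; it only mirrors the shape of the log lines above.
    """
    args = ", ".join(f"{key}={value!r}" for key, value in call_args.items())
    return f"[{hostname}:{port}] RPC {service}.{method}({args})"


line = format_rpc_log(
    "psm-grpc-server-69bcf749c5-bg4x5",
    52902,
    "Health",
    "Check",
    timeout=90,
    wait_for_ready=True,
)
print(line)
```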
In Python tests that require the set_not_serving server RPC, override
the Python server with the reference server (Java) because
the Python server doesn't yet support the set_not_serving RPC.
Ref https://github.com/grpc/grpc/issues/30635.
Separates the xDS Test Client/Server (which represent an interface to the corresponding workload running remotely) from their runners (Kubernetes-specific logic to provision the workloads with their prerequisites).
This is a refactoring and should not change the behavior.
Some tests override unittest's `tearDown()`, which is not wrong, but is less resilient than overriding the custom `cleanup()`, which is retried in the framework's `tearDown()`.
- xDS interop: add support for the reference xds test server
- Set default xDS test server reference to Java `v1.48.1`
- Override xDS test server with the reference in Outlier Detection
* Add xDS interop test for outlier detection
This implements the test described in #29623, and plumbing for setting the
outlierDetection field in the backend service config. The changes in this PR
are very similar to #29688.
* Fix use of configure method
* Correct copy/paste error
* Fix metadata configuration syntax
* Increase QPS, use just one method
* Format code
* Apply suggestions from code review
Co-authored-by: Sergii Tkachenko <hi@sergii.org>
* Address review comments
* Only Java implements the required server features
* Automated change: Fix sanity tests
* Address review comments
* Use double quotes for docstring
Co-authored-by: Sergii Tkachenko <hi@sergii.org>
Co-authored-by: murgatroid99 <murgatroid99@users.noreply.github.com>
Resume the failover test. For now, just on master. It will be resumed on other branches once the fix is backported.
At the moment, master is fixed in Java and Go.
ref b/238226704
Added a couple of tests that run the baseline_test with all released
bootstrap generator versions on client and server. These tests will
run in a continuous integration environment with gRPC servers and
clients built using the latest released version of gRPC in one selected
language.
* Add supported Node version ranges in xDS k8s url_map tests
This adds is_supported implementations for most of the url_map tests that didn't
already have them. The exception is metadata_filter_test because it doesn't use
any specific client features.
* Fix formatting
* Improve timeout test check order
1. Fix the issue with Java PSM security tests being accidentally skipped because Java was missing from the list of languages, ref https://github.com/grpc/grpc/pull/28978
2. Invert the logic of `is_supported` methods, making them normally open
3. Make languages an `enum.Flag` to avoid accidental typos when listing the languages
4. Rename `XdsKubernetesTestCase.isSupported` to `XdsKubernetesTestCase.is_supported` to be consistent with `XdsUrlMapTestCase.is_supported`
5. Add extra logging
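
Items 2 and 3 can be sketched roughly as follows. The `Lang` members, the version tuple, and the `(1, 41)` threshold are placeholders chosen for illustration, not the values used by the actual tests:

```python
import enum


class Lang(enum.Flag):
    """Hypothetical language flags; a misspelled member raises AttributeError."""

    CPP = enum.auto()
    GO = enum.auto()
    JAVA = enum.auto()
    PYTHON = enum.auto()
    NODE = enum.auto()


def is_supported(client_lang, version):
    """Normally open: supported unless a known restriction applies."""
    # Placeholder restriction: only these languages have a minimum version.
    if client_lang in Lang.CPP | Lang.PYTHON:
        return version >= (1, 41)
    # Any language not listed above is supported by default, so a language
    # missing from a list no longer causes its tests to be skipped.
    return True


print(is_supported(Lang.JAVA, (1, 30)))
print(is_supported(Lang.CPP, (1, 40)))
```

With string-based language lists, a typo or an omitted language silently skips tests; with `enum.Flag` the typo fails loudly, and the normally-open default keeps unlisted languages running.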
* [PSM interop] Expand the support of test config validation
* Comment the usage and source of testing_version
* Also include the comment for url-map tests
* Revert "Revert "Add api listener test for k8s (#27534)" (#28719)"
This reverts commit c35b93f28d.
Fix the parsing logic of the RDS response from CSDS to support different response formats. Use the common parsing logic from url_map in this test case.
* Revert "Revert "[App Net] Switch Router to Mesh and Add unique string to Scope (#28145)" (#28176)"
This reverts commit cc968b2158.
* Allow scope to be None
The authz test flaked because no RPCs of the expected type had completed
within the sampling window. Server logs showed authz logs completing a
batch of 276 RPCs back-to-back, without the expected 40 ms separation
(qps=25). It took a bit over 1 second to process the backlog.
With the sample duration of 500 ms and there being a polling delay
between when the channel is READY and when the test driver polls
channelz, it makes sense that we can get lucky much of the time.
Obviously, adding a sleep isn't great either, but measuring the queue
length indirectly is more complex than really appropriate here. The real
solution is to stop using this continuous-qps test client.
```
Traceback (most recent call last):
File "/tmp/work/grpc/tools/run_tests/xds_k8s_test_driver/tests/authz_test.py", line 252, in test_tls_allow
grpc.StatusCode.OK)
File "/tmp/work/grpc/tools/run_tests/xds_k8s_test_driver/tests/authz_test.py", line 183, in configure_and_assert
method=rpc_type)
File "/tmp/work/grpc/tools/run_tests/xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 284, in assertRpcStatusCodes
self.assertGreater(stats.result[status_code.value[0]], 0)
AssertionError: 0 not greater than 0
```
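The numbers in the description can be checked with a quick back-of-the-envelope calculation. The variable names are made up for illustration; only the values (qps=25, 500 ms sample, 276 RPCs) come from the text above.

```python
qps = 25
sample_duration_ms = 500
backlog_rpcs = 276

# Expected spacing between RPCs at a steady rate.
interval_ms = 1000 / qps

# RPCs a 500 ms sample would see if the client were sending steadily.
expected_rpcs_per_sample = sample_duration_ms / interval_ms

# How many seconds of sends at qps=25 the 276-RPC batch corresponds to.
backlog_equivalent_s = backlog_rpcs / qps

print(interval_ms, expected_rpcs_per_sample, backlog_equivalent_s)
```

So a healthy sample should contain about a dozen RPCs, while the observed batch corresponds to roughly 11 seconds of sends at the configured rate that were processed back-to-back.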
* Add back references and scope field
* Set scope in router
* Reverse order of cleanup
* Add router_scope flag
* Use router_scope flag to create Router
* I apparently don't know how to brain
* Yapf
* Yeah, that can't be the default
* Remove debug print
* Remove impossible todos
* And another
* Switch from router-scope to config-scope
* Implement schema changes
* Use backend service URL
* Use CLH reference format to backend service
* I am an idiot
* *internal screaming*
* Try project number
* Why is this all awful
* Go back to trying project name
* Try cleaning things up
* Agh
* Address review comments
* Remove superfluous Optional type
A broken fix for the server-side bug is producing invalid configuration,
causing the server to reject all of the configuration. Disable the
configuration and tests until the fix is fixed.
The control plane was updated to more properly match the principal being
present, so now plaintext and mTLS are working properly. But the change
uses slightly wrong semantics for TLS, so we get to change which
tests are commented out.