New source of truth: https://github.com/grpc/psm-interop.
This PR removes PSM Interop framework source code from `tools/run_tests/xds_k8s_test_driver`, and all references to it.
Closes #35466
PiperOrigin-RevId: 597636949
This PR adds CSM Observability testing capability to the PSM Interop testing framework. It mostly changes the framework's Python code.
This adds an `enable_csm_observability` flag to the client/server deployment yaml file such that, when enabled, we create a GMP `PodMonitoring` resource and pass `--enable_csm_observability` to each language's client/server container (so that it actually enables the Prometheus endpoint).
I added a new test under `tests/csm/csm_observability_test.py`. It is essentially a copy of `tests/baseline_test.py`, but with `enable_csm_observability=True`.
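For illustration, a minimal sketch of how such a flag could be threaded from the test runner into the deployment; `deploy_client`, `create_pod_monitoring`, and `create_deployment` are hypothetical names, not the framework's actual API:
```
# Hypothetical sketch: helper names are illustrative, not the real API.
def deploy_client(runner, enable_csm_observability: bool = False):
    args = ["--server=xds:///psm-grpc-server:8080"]
    if enable_csm_observability:
        # Passed to the language-specific container so it enables the
        # Prometheus endpoint.
        args.append("--enable_csm_observability")
        # Created so GMP scrapes the matching pods.
        runner.create_pod_monitoring()
    runner.create_deployment(args=args)
```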
Other PRs for this whole thing to work:
- https://github.com/grpc/grpc/pull/34752: The `PodMonitoring` resource yaml template
- https://github.com/grpc/grpc/pull/34832: Support for the `--enable_csm_observability` flag in the C++ client/server image
Closes #34835
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34835 from stanley-cheung:csm-o11y-framework-changes 0b3d0eb7ed
PiperOrigin-RevId: 595502496
Closes #35280
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35280 from yashykt:UpdateInteropScriptForFindingAdsChannel db213384b4
PiperOrigin-RevId: 591090750
Removes noise from the cleanup/teardown ops.
#### GCP APIs
In GCP APIs, change the log level for delete operations that failed because the resource doesn't exist (API 404) from `info` to `debug`. The framework's logging philosophy is to only log external operations (e.g. API calls, RPCs); if no error is logged, the operation is assumed successful.
In the deletion case, it is still possible to tell whether the operation was actually performed by observing the `Waiting %s sec for %s operation id: %s` log message.
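A minimal sketch of this policy, assuming a googleapiclient-style `HttpError` (not the framework's actual code):
```
import logging

from googleapiclient.errors import HttpError

logger = logging.getLogger(__name__)


def delete_health_check(compute, project: str, name: str) -> None:
    try:
        compute.healthChecks().delete(
            project=project, healthCheck=name).execute()
    except HttpError as error:
        if error.resp.status == 404:
            # The resource is already gone: log at debug, not info.
            logger.debug("Health check %s not found, skipping deletion", name)
            return
        raise
```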
#### K8s APIs
In K8s APIs:
- For delete operations that failed because the resource doesn't exist (API 404), the log level is changed from `info` to `debug`
- For delete operations that failed for any other reason, the log level is changed from `info` to `warning`
- When `wait_for_deletion` is enabled (the default), the delete operation is confirmed with `logger.info("<resource_kind> %s deleted", name)`; previously this was logged at the `debug` level
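A sketch of these rules, assuming the official `kubernetes` Python client (not the framework's actual implementation):
```
import logging

from kubernetes.client.rest import ApiException

logger = logging.getLogger(__name__)


def delete_deployment(apps_v1, name: str, namespace: str,
                      wait_for_deletion: bool = True) -> None:
    try:
        apps_v1.delete_namespaced_deployment(name=name, namespace=namespace)
    except ApiException as e:
        if e.status == 404:
            logger.debug("Deployment %s not found, skipping deletion", name)
        else:
            logger.warning("Deployment %s deletion failed: %s", name, e)
        return
    if wait_for_deletion:
        # ... poll until reads of the deployment start returning 404 ...
        logger.info("Deployment %s deleted", name)
```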
Closes #35131
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35131 from sergiitk:psm-interop-debug-log-on-delete-404 f6629e5132
PiperOrigin-RevId: 587851692
Currently we log two nearly identical messages:
```
client_app.py:320] [psm-grpc-client-7768f6597-nvtgl] Detected successful calls to xDS control plane: trafficdirector.googleapis.com:443
client_app.py:292] [psm-grpc-client-7768f6597-nvtgl] ADS: Detected successful calls to xDS control plane trafficdirector.googleapis.com:443
```
This PR will log the latest channel state in the first message, similar
to what we do in `find_server_channel_with_state`:
52c08f4498/tools/run_tests/xds_k8s_test_driver/framework/test_app/client_app.py (L367-L371)
After the change:
```
client_app.py:320] [psm-grpc-client-6566595cff-8wrfd] Detected successful calls to xDS control plane trafficdirector.googleapis.com:443, channel: <Channel channel_id=4 target=trafficdirector.googleapis.com:443 call_started=9 calls_failed=8 state=READY>
client_app.py:292] [psm-grpc-client-6566595cff-8wrfd] ADS: Detected successful calls to xDS control plane trafficdirector.googleapis.com:443
```
Add a `PodMonitoring` resource type to the PSM interop testing
framework. This is needed so that GMP (Google Managed Prometheus) can
scrape the matching GKE pods' Prometheus endpoints for metrics.
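For illustration, a sketch of creating such a resource with the `kubernetes` client; the selector label, metrics port, and namespace below are assumptions:
```
from kubernetes import client, config

config.load_kube_config()
custom_api = client.CustomObjectsApi()

pod_monitoring = {
    "apiVersion": "monitoring.googleapis.com/v1",
    "kind": "PodMonitoring",
    "metadata": {"name": "psm-pod-monitoring"},
    "spec": {
        # Assumed selector label and metrics port.
        "selector": {"matchLabels": {"app": "psm-grpc-client"}},
        "endpoints": [{"port": 9464, "interval": "30s"}],
    },
}

custom_api.create_namespaced_custom_object(
    group="monitoring.googleapis.com",
    version="v1",
    namespace="default",
    plural="podmonitorings",
    body=pod_monitoring,
)
```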
When a test fails, the cleanup script will try to delete resources
we didn't create, resulting in lots of 404 errors. We should exclude
that status code, since we have specific handling for 404.
1. Changes the resource retention period to 2 days for all resources
(previously 7 days for TD resources, 6 hours for k8s). This solved a
problem with k8s resources being stuck because corresponding TD
resources weren't deleted. (A retention check is sketched below.)
2. Resume on namespace cleanup failures
3. Add secondary lb cluster cleanup logic
4. Modularize `grpc_xds_resource_cleanup.sh`
5. Make `KubernetesNamespace`'s methods `pretty_format_status` and
`pretty_format_metadata` public
6. `pretty_format_status`: also print resource kind, creation and
deletion requested dates
ref b/259724370, cl/517235715
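A sketch of the 2-day retention check from item 1, assuming resources expose an RFC 3339 creation timestamp (illustrative only):
```
import datetime

RESOURCE_RETENTION = datetime.timedelta(days=2)


def is_expired(creation_timestamp: str) -> bool:
    # e.g. "2023-04-01T12:00:00Z"
    created = datetime.datetime.fromisoformat(
        creation_timestamp.replace("Z", "+00:00"))
    age = datetime.datetime.now(datetime.timezone.utc) - created
    return age > RESOURCE_RETENTION
```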
We're seeing too many debug headers; change it to only log headers in
case of error.
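A sketch of the intended behavior; the helper is hypothetical, not the actual call site:
```
import logging

logger = logging.getLogger(__name__)


def maybe_log_headers(status_code: int, headers: dict) -> None:
    # Only dump response headers when the request failed.
    if status_code >= 400:
        logger.debug("Response headers (status %d): %s", status_code, headers)
```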
We shouldn't just set `termination_grace_period_seconds=600` by default
for all gamma tests extending `GammaXdsKubernetesTestCase`.
This is what's causing the deployment deletion issue:
> `framework.helpers.retryers.RetryError: Retry error calling
framework.xds_k8s_testcase.IsolatedXdsKubernetesTestCase.cleanup: 1
attempts exhausted. Last exception: RetryError: Retry error calling
framework.infrastructure.k8s.KubernetesNamespace.get_deployment: timeout
0:05:00 (h:mm:ss) exceeded. Check result callback returned False.`
We wait for 5 minutes, while the deployment is happily hanging for 10.
Then the second cleanup retry kills it, but not before waiting for
another 5 minutes.
I think `self.force = False` may be solving another issue triggered by
the `get_deployment` retry timeout: because we start over deleting the
resources by name, and some of them were already deleted in the first
attempt, we get 404s. And I'm pretty sure we don't handle errors correctly
when deleting CRD-based resources, which cascades into even more
unnecessary retries.
- GAMMA server runner: increase the wait time for the NEG annotation
from 1 minute to 3 minutes.
- Improve the wording around the wait-for-NEG methods to make it clear
that it's the annotation we're waiting for, so it's not confused with
getting the NEG health from the GCP APIs.
ref b/298501683, b/302723651
Waiting for an active channel takes less time in non-GAMMA test suites
because they only start waiting after having already waited for the TD
backends to be created and report healthy.
In GAMMA, these resources are created asynchronously by Kubernetes. To
compensate for this, we double the timeout for GAMMA tests.
This change is to update the TD bootstrap generator for prod tests. This
is part of the TD release process. The new image has already been merged
to staging and tested locally in google3.
cc: @sergiitk PTAL.
This is just an initial scope of tests. Much of this code was written by
@ginayeh. I just did the final polish/integration step.
There are 3 main tests included:
1. The GAMMA baseline test, including the [actual GAMMA
API](https://gateway-api.sigs.k8s.io/geps/gep-1426/) rather than vendor
extensions.
2. Kubernetes-based stateful session affinity tests, where the mesh
(including SSA configuration) is configured using CRDs
3. GCP-based stateful session affinity tests, where the mesh is
configured using the networkservices APIs directly
Tests 1 and 2 will run in both prod and GKE staging, i.e.
`container.googleapis.com` and
`staging-container.sandbox.googleapis.com`. The latter of these will act
as an early detection mechanism for regressions in the controller that
translates Gateway resources into networkservices resources.
Test 3 will run against `staging-networkservices.sandbox.googleapis.com`
to act as an early detection mechanism for regressions in the control
plane SSA implementation.
The scope of the SSA tests is still fairly minimal. Session drain
testing is in-progress but not included in this PR, though several
elements required for it are (grace period, pre-stop hook, and the
ability to kill a single pod in a deployment).
---------
Co-authored-by: Jung-Yu (Gina) Yeh <ginayeh@google.com>
Co-authored-by: Sergii Tkachenko <sergiitk@google.com>
This change is to update the TD bootstrap generator for prod tests. This
is part of the TD release process. The new image has already been merged
to staging and tested locally in google3.
cc: @sergiitk PTAL.
Previously black wouldn't install, as it required a newer `packaging`
package.
This fixes `pip install -r requirements-dev.txt`. In addition, `black`
in dev dependencies file is changed to `black[d]`, which bundles
`blackd` binary (["black as a
server"](https://black.readthedocs.io/en/stable/usage_and_configuration/black_as_a_server.html)).
Fixes an issue where the automatically selected active context was
picked up as the context for `secondary_k8s_api_manager`.
This was introducing an error in the GAMMA Baseline PoC:
```
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('100.71.2.143', 56723), raddr=('35.199.174.232', 443)>
```
Here's how the secondary context incorrectly falls back to the
default context when `--secondary_kube_context` is not set:
```
k8s.py:142] Using kubernetes context "gke_grpc-testing_us-central1-a_psm-interop-security", active host: https://35.202.85.90
k8s.py:142] Using kubernetes context "None", active host: https://35.202.85.90
```
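A sketch of loading the secondary context explicitly instead of letting it fall back to the active one, assuming the official `kubernetes` client:
```
from typing import Optional

from kubernetes import config


def secondary_api_client(context_name: Optional[str]):
    if not context_name:
        raise ValueError(
            "--secondary_kube_context must be set explicitly; refusing to "
            "fall back to the active context")
    return config.new_client_from_config(context=context_name)
```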
- Add a GitHub Action to conditionally run PSM Interop unit tests:
- Only run when changes are detected in
`tools/run_tests/xds_k8s_test_driver` or any of the proto files used by
the driver
- Only run against PRs and pushes to `master`, `v1.*.*` branches
- Runs using `python3.9` and `python3.10`
- Ready to be added to the list of required GitHub checks
- Add a `tools/run_tests/xds_k8s_test_driver/tests/unit/__main__.py` test
loader that recursively discovers all unit tests in
`tools/run_tests/xds_k8s_test_driver/tests/unit` (a minimal loader is
sketched below)
- Add basic coverage for `XdsTestClient` and `XdsTestServer` to verify
the test loader picks up all folders
Related:
- First unit tests without automated CI added in #34097
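A minimal sketch of such a recursive loader; the discovery pattern is an assumption:
```
import os
import unittest

if __name__ == "__main__":
    loader = unittest.TestLoader()
    suite = loader.discover(
        start_dir=os.path.dirname(os.path.abspath(__file__)),
        pattern="*_test.py",  # assumed naming convention
    )
    result = unittest.TextTestRunner(verbosity=2).run(suite)
    raise SystemExit(0 if result.wasSuccessful() else 1)
```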
The tests are skipped incorrectly because `config.server_lang` is
compared with the string value "java" instead of
`skips.Lang.JAVA`.
This has been broken since #26998.
```
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInRouteRule -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInRouteRule)
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInApplication -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInApplication)
```
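The corrected comparison looks roughly like this; the import path is an assumption:
```
from framework.helpers import skips  # assumed import path


def server_is_java(config) -> bool:
    # Compare against the enum, not the raw string "java".
    return config.server_lang == skips.Lang.JAVA
```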
This is to make sure that upgrading the `packaging` module won't break
our version-based skipping logic.
This also fixes a small issue with the `dev-` prefix: it should only be
allowed on the left side of the comparison.
Context: the `packaging` module needs to be upgraded to be compatible with
`blackd`.
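A sketch of the intended semantics using `packaging.version`; the framework's actual version parsing differs:
```
from packaging import version


def version_gte(test_version: str, min_version: str) -> bool:
    # A "dev-" prefix is only meaningful on the left side of the comparison
    # and is treated as newer than any release.
    if test_version.startswith("dev-"):
        return True
    return version.Version(test_version) >= version.Version(min_version)
```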
Disables the warning produced by `kubernetes/client/rest.py` calling the
deprecated `urllib3.response.HTTPResponse.getheaders()` and
`urllib3.response.HTTPResponse.getheader()` methods:
```
venv-test/lib/python3.9/site-packages/kubernetes/client/rest.py:44: DeprecationWarning: HTTPResponse.getheaders() is deprecated and will be removed in urllib3 v2.1.0. Instead access HTTPResponse.headers directly.
return self.urllib3_response.getheaders()
```
This issue was introduced by openapi-generator and is solved in
openapi-generator `v6.4.0`. To fix the issue properly, the
kubernetes-client/python folks need to regenerate the library using a newer
openapi-generator. The most recent release, `v27.2.0`, still used
openapi-generator
[`v4.3.0`](https://github.com/kubernetes-client/python/blob/v27.2.0/kubernetes/.openapi-generator/VERSION).
Since they release twice a year, and openapi-generator is two major versions
behind, the fix may take a while.
Created an issue in their repo:
https://github.com/kubernetes-client/python/issues/2101.
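In the meantime, a sketch of one way to silence just this warning (matched by message text; not necessarily the mechanism used here):
```
import warnings

warnings.filterwarnings(
    "ignore",
    message=r"HTTPResponse\.getheaders?\(\) is deprecated",
    category=DeprecationWarning,
)
```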
Addresses some issues with the initial triage hint PR:
https://github.com/grpc/grpc/pull/33898.
1. Print the unhealthy backend's name before the health info; previously it
was unclear which backend's health status was being dumped.
2. Add missing `retry_err.add_note(note)` calls.
3. Turn off the highlighter in triager hints, which isn't rendered
properly in the stack trace saved to junit.xml.
I was hoping this would solve the issue with `DeprecationWarning:
HTTPResponse.getheaders() is deprecated`, but it didn't.
Anyway, we should be updating this from time to time.
Changelog:
https://github.com/kubernetes-client/python/blob/release-27.0/CHANGELOG.md
The client library changes from `25.3.0` to `27.2.0` are minimal.
The majority of the changelog is API updates pulled from k8s upstream.
This clearly indicates which errors are "blanket" errors and are not a
root cause on their own.
This also moves the debug info containing the last known status of an object
the framework was waiting for but bailed out on due to a timeout.
Previously it was printed as the last error message in the test; with this
PR it is printed after the stack trace that caused the test failure.
In addition, I added similar debug information to the "wait for NEGs
to become healthy" step. Now it prints the statuses of unhealthy backends.
To achieve that, I mimicked the upcoming [PEP
678](https://peps.python.org/pep-0678/) exception notes feature. When
we upgrade to py3.11, we'll be able to remove the `add_note()` methods and
get the same functionality for free.
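A sketch of what such an `add_note()` shim can look like on pre-3.11 Python (illustrative, not the framework's exact code):
```
class FrameworkError(Exception):
    """Base error with a PEP 678-style add_note() shim for Python < 3.11."""

    def add_note(self, note: str) -> None:
        notes = getattr(self, "__notes__", [])
        notes.append(note)
        self.__notes__ = notes

    def __str__(self) -> str:
        notes = getattr(self, "__notes__", [])
        return "\n".join([super().__str__(), *notes])
```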
This PR fixes the bootstrap generator interop test by making the node
metadata flag dependent on the version. This was previously causing a
breakage, because not all bootstrap generator versions support the
de-experimentalized flag.
The most important change here is setting the `resource_suffix` and
`server_xds_port` flags to "generate randomly" by default. Previously we
suggested static values, and devs ended up with resource conflict errors.
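A sketch of what "generate randomly" can mean here; the alphabet, length, and port range are assumptions:
```
import random
import string


def random_resource_suffix(length: int = 5) -> str:
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choices(alphabet, k=length))


def random_server_xds_port() -> int:
    # Assumed port range; the framework may restrict this further.
    return random.randint(1024, 65535)
```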