This change updates the TD bootstrap generator for prod tests. This is
part of the TD release process. The new image has already been merged
to staging and tested locally in google3.
cc: @sergiitk PTAL.
Previously, `black` wouldn't install because it required a newer
`packaging` package.
This fixes `pip install -r requirements-dev.txt`. In addition, `black`
in the dev dependencies file is changed to `black[d]`, which bundles
the `blackd` binary (["black as a
server"](https://black.readthedocs.io/en/stable/usage_and_configuration/black_as_a_server.html)).
Fixes an issue where the automatically selected active context was
picked up as the context for `secondary_k8s_api_manager`.
This was introducing an error in the GAMMA Baseline PoC:
```
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('100.71.2.143', 56723), raddr=('35.199.174.232', 443)>
```
Here's how the secondary context incorrectly falls back to the
default context when `--secondary_kube_context` is not set:
```
k8s.py:142] Using kubernetes context "gke_grpc-testing_us-central1-a_psm-interop-security", active host: https://35.202.85.90
k8s.py:142] Using kubernetes context "None", active host: https://35.202.85.90
```
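For illustration, with the standard kubernetes Python client a `None` context silently resolves to the active kubeconfig context, which is how the unset flag ended up pointing `secondary_k8s_api_manager` at the primary cluster (a minimal sketch, not the driver's actual code):
```py
# Illustration only, assuming the standard kubernetes Python client:
# a None context resolves to the currently active kubeconfig context.
from kubernetes import config

# Equivalent to "use whatever context is currently active in ~/.kube/config".
api_client = config.new_client_from_config(context=None)
```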
- Add GitHub Action to conditionally run PSM Interop unit tests:
- Only run when changes are detected in
`tools/run_tests/xds_k8s_test_driver` or any of the proto files used by
the driver
- Only run against PRs and pushes to `master`, `v1.*.*` branches
- Runs using `python3.9` and `python3.10`
- Ready to be added to the list of required GitHub checks
- Add `tools/run_tests/xds_k8s_test_driver/tests/unit/__main__.py` test
loader that recursively discovers all unit tests in
`tools/run_tests/xds_k8s_test_driver/tests/unit` (a sketch of such a
loader follows the list)
- Add basic coverage for `XdsTestClient` and `XdsTestServer` to verify
the test loader picks up all folders
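A minimal sketch of what such a recursive loader can look like (the `*_test.py` discovery pattern is an assumption; the actual `__main__.py` may differ):
```py
# Hypothetical sketch of tests/unit/__main__.py.
import os
import sys
import unittest

if __name__ == "__main__":
    start_dir = os.path.dirname(os.path.abspath(__file__))
    # Recursively discover all unit tests under tests/unit.
    suite = unittest.TestLoader().discover(start_dir=start_dir, pattern="*_test.py")
    result = unittest.TextTestRunner(verbosity=2).run(suite)
    sys.exit(0 if result.wasSuccessful() else 1)
```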
Related:
- First unit tests without automated CI added in #34097
The tests are incorrectly skipped because `config.server_lang` is
compared with the string value "java" instead of `skips.Lang.JAVA`.
This has been broken since #26998.
```
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInRouteRule -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInRouteRule)
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInApplication -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInApplication)
```
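A minimal illustration of the bug (not the exact driver code): `server_lang` is a `skips.Lang` enum member, so comparing it to a plain string never matches and the skip logic takes the wrong branch:
```py
# Illustration only: config.server_lang is a skips.Lang enum member.
if config.server_lang == "java":          # always False: enum member != str
    ...

# Fixed comparison against the enum member:
if config.server_lang == skips.Lang.JAVA:
    ...
```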
This is to make sure that upgrading the packaging module won't break
our version-based test skipping logic.
This also fixes a small issue with the `dev-` prefix: it should only be
allowed on the left side of the comparison.
Context: the packaging module needs to be upgraded to be compatible
with `blackd`.
Disables the warning produced by kubernetes/client/rest.py calling the
deprecated `urllib3.response.HTTPResponse.getheaders` and
`urllib3.response.HTTPResponse.getheader` methods:
```
venv-test/lib/python3.9/site-packages/kubernetes/client/rest.py:44: DeprecationWarning: HTTPResponse.getheaders() is deprecated and will be removed in urllib3 v2.1.0. Instead access HTTPResponse.headers directly.
return self.urllib3_response.getheaders()
```
This issue was introduced by openapi-generator, and is solved in its
`v6.4.0`. To fix the issue properly, the kubernetes/python folks need
to regenerate the library using a newer openapi-generator. The most
recent release `v27.2.0` still uses openapi-generator
[`v4.3.0`](https://github.com/kubernetes-client/python/blob/v27.2.0/kubernetes/.openapi-generator/VERSION).
Since they release twice a year, and given the two-major-version gap in
openapi-generator, the fix may take a while.
Created an issue in their repo:
https://github.com/kubernetes-client/python/issues/2101.
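For reference, one way to silence only this warning (a sketch, not necessarily how this change implements it) is a targeted `warnings` filter:
```py
import warnings

# Hypothetical sketch: ignore the urllib3 deprecation warning attributed to
# kubernetes/client/rest.py, keeping other DeprecationWarnings visible.
warnings.filterwarnings(
    "ignore",
    category=DeprecationWarning,
    module="kubernetes.client.rest",
    message=r"HTTPResponse\.getheaders\(\) is deprecated",
)
```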
Addresses some issues of the initial triage hint PR:
https://github.com/grpc/grpc/pull/33898.
1. Print the unhealthy backend's name before its health info - previously
it was unclear which backend's health status was being dumped
2. Add missing `retry_err.add_note(note)` calls
3. Turn off the highlighter in triager hints, which isn't rendered
properly in the stack trace saved to junit.xml
I was hoping this would solve the issue with `DeprecationWarning:
HTTPResponse.getheaders() is deprecated`, but it didn't.
Anyway, we should be updating this from time to time.
Changelog:
https://github.com/kubernetes-client/python/blob/release-27.0/CHANGELOG.md
The client library changes from `25.3.0` to `27.2.0` are minimal.
The majority of the changelog is API updates pulled from k8s upstream.
This clearly indicates which errors are "blanket" errors and are not a
root cause on their own.
This also moves the debug info containing the last known status of an
object the framework was waiting for before bailing out due to a timeout.
Previously it was printed as the last error message in the test; now it
is printed after the stack trace that caused the test failure.
In addition, I added similar debug information to the "wait for NEGs
to become healthy" step. Now it prints the statuses of unhealthy backends.
To achieve that, I mimicked the upcoming [PEP
678](https://peps.python.org/pep-0678/) Exception notes feature. When
we upgrade to Python 3.11, we'll be able to remove the `add_note()`
methods and get the same functionality for free.
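A rough sketch of what mimicking exception notes can look like before Python 3.11 (the class name is illustrative, not the framework's actual exception type):
```py
# Hypothetical sketch: pre-3.11 stand-in for PEP 678 BaseException.add_note().
# The framework itself is responsible for printing __notes__ with the traceback.
class FrameworkError(Exception):
    def add_note(self, note: str) -> None:
        notes = getattr(self, "__notes__", None)
        if notes is None:
            notes = []
            self.__notes__ = notes
        notes.append(note)
```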
This PR fixes the bootstrap generator interop test by making the node
metadata flag dependent on the version. Previously this was causing a
breakage, because not all bootstrap generator versions support the
de-experimentalized flag.
The most important change here is setting the `resource_suffix` and
`server_xds_port` flags to "generate randomly" by default. Previously we
were suggesting static values, and devs ended up with resource conflict
errors.
PanCakes to the rescue!
We noticed that our 'sanity' test was going to fail, but we think we can
fix that automatically, so we put together this PR to do just that!
If you'd like to opt out of these PRs, add yourself to NO_AUTOFIX_USERS
in .github/workflows/pr-auto-fix.yaml
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
As part of the dualstack backend designs, subchannels will be created
lazily. Therefore, instead of asserting that there is 1 READY subchannel
and `n - 1` IDLE subchannels, we just assert that there is 1 READY
subchannel.
Similar to https://github.com/grpc/grpc/pull/33542.
Note that there's a ticket to automatically use the one specified in the
`--server_image_canonical` flag, but for now we just hardcode.
ref b/261911148, b/282106799.
This is a no-op, just reordering `requirements.lock`.
By providing `-r requirements.txt` to `pip freeze`, it's able to separate
the dependencies required via `requirements.txt` from the
sub-dependencies installed to satisfy them.
- Switched from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies
Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what's described
here: https://github.com/psf/black/issues/1837. While this feature is
currently experimental, it will be enabled by default:
https://github.com/psf/black/issues/2188. I first ran black with the
new string processing so that the generated code merges these `"hello" "
world"` string concatenations, then removed
`--experimental-string-processing` for stability, and regenerated the
code again.
To the reviewer: don't even try to open "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore `run black and isort`.
Do not clutter the final error we see at the end with the before/after
stats.
#### Examples
###### Expected only status A, but found status B for method M:
```
[ FAILED ] CustomLbTest.test_custom_lb_config
======================================================================
FAIL: test_custom_lb_config (__main__.CustomLbTest)
CustomLbTest.test_custom_lb_config
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/sergiitk/Development/grpc/tools/run_tests/xds_k8s_test_driver/tests/custom_lb_test.py", line 113, in test_custom_lb_config
self.assertRpcStatusCodes(test_client,
File "/Users/sergiitk/Development/grpc/tools/run_tests/xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 345, in assertRpcStatusCodes
found_status = helpers_grpc.status_from_int(found_status_int)
AssertionError: Expected only status (15, DATA_LOSS), but found status (0, OK) for method UNARY_CALL.
Diff stats:
- method: UNARY_CALL
rpcs_started: 251
result:
(0, OK): 251
```
###### Expected non-zero RPCs with status A for method M.
```
[ FAILED ] AuthzTest.test_plaintext_allow
======================================================================
FAIL: test_plaintext_allow (__main__.AuthzTest)
AuthzTest.test_plaintext_allow
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/sergiitk/Development/grpc/tools/run_tests/xds_k8s_test_driver/tests/authz_test.py", line 224, in test_plaintext_allow
self.configure_and_assert(test_client, 'host-wildcard',
File "/Users/sergiitk/Development/grpc/tools/run_tests/xds_k8s_test_driver/tests/authz_test.py", line 204, in configure_and_assert
self.assertRpcStatusCodes(test_client,
File "/Users/sergiitk/Development/grpc/tools/run_tests/xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 355, in assertRpcStatusCodes
self.assertGreater(stats.result[expected_status_int],
AssertionError: 0 not greater than 0 : Expected non-zero completed RPCs with status (0, OK) for method EMPTY_CALL.
Diff stats:
- method: EMPTY_CALL
rpcs_started: 13
result: {}
```
Improvements to the `LoadBalancerAccumulatedStatsRequest` output. Makes
it readable.
This greatly affects `assertRpcStatusCodes()` output, used in authz and
custom_lb.
No more before-and-after stats, just the useful diff stats. Minimal and
readable.
The diff stats now also include `rpcs_started`.
![image](https://github.com/grpc/grpc/assets/672669/a4e38d82-be5a-4f31-9d88-da2bf9712d9b)
Output example:
```
--- Starting subTest __main__.AuthzTest.test_plaintext_allow.01_host_wildcard ---
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC XdsUpdateClientConfigureService.Configure(request=ClientConfigureRequest({'types': ['EMPTY_CALL'], 'metadata': [{'key': 'test', 'value': 'host-wildcard'}]}), timeout=5, wait_for_ready=True)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7:51561] >> RPC LoadBalancerStatsService.GetClientAccumulatedStats(request=LoadBalancerAccumulatedStatsRequest({}), wait_for_ready=True, timeout=600)
[psm-grpc-client-765bfbf868-jqjm7] << Received accumulated stats difference. Expecting RPCs with status (0, OK) for method EMPTY_CALL.
- method: EMPTY_CALL
rpcs_started: 13
result:
(0, OK): 14
--- Finished subTest __main__.AuthzTest.test_plaintext_allow.01_host_wildcard ---
```
In case of test failure, it'll still print all stats at the end,
including before and after:
```
AssertionError: Expected only status (15, DATA_LOSS), but found status (0, OK) for method UNARY_CALL.
Stats before:
- method: UNARY_CALL
rpcs_started: 2153
result:
(14, UNAVAILABLE): 1674
(0, OK): 479
Stats after:
- method: UNARY_CALL
rpcs_started: 2404
result:
(0, OK): 730
(14, UNAVAILABLE): 1674
Diff stats:
- method: UNARY_CALL
rpcs_started: 251
result:
(0, OK): 251
```
And while I was at it, I also made `LoadBalancerStatsResponse` nice:
![image](https://github.com/grpc/grpc/assets/672669/b15908a7-bae4-41a0-a2f7-c903e398432a)
Fixes the issue introduced in https://github.com/grpc/grpc/pull/33104,
where stopping the current run didn't reset `self.time_start_requested`,
`self.time_start_completed`, `self.time_start_stopped`. Because of this,
the subsetting test (the only one [redeploying the client
app](10001d16a9/tools/run_tests/xds_k8s_test_driver/tests/subsetting_test.py (L73C1-L74)))
started failing with:
```py
Traceback (most recent call last):
File "xds_k8s_test_driver/tests/subsetting_test.py", line 76, in test_subsetting_basic
test_client: _XdsTestClient = self.startTestClient(
File "xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 615, in startTestClient
test_client = self.client_runner.run(server_target=test_server.xds_uri,
File "xds_k8s_test_driver/framework/test_app/runners/k8s/k8s_xds_client_runner.py", line 110, in run
super().run()
File "xds_k8s_test_driver/framework/test_app/runners/k8s/k8s_base_runner.py", line 112, in run
raise RuntimeError(
RuntimeError: Deployment psm-grpc-client: has already been started at 2023-05-27T13:47:15.262461
```
This PR:
1. Instead of relying on `time_start_requested` and `time_start_stopped`
to produce GCP links, tracks the run history of each deployment (see the
sketch after this list). This fixes the issue described above, and adds
support for listing all past runs executed by a k8s runner.
2. Minor: removes the unnecessary call to `test_client.cleanup()` when
there are no past deployment runs (e.g. at the first iteration of `for i
in range(_NUM_CLIENTS):`)
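A hypothetical sketch of tracking run history on the runner instead of a single set of timestamps (class and field names are illustrative):
```py
import dataclasses
import datetime
from typing import List, Optional

# Hypothetical sketch: one record per deployment run, appended by the k8s runner.
@dataclasses.dataclass
class RunRecord:
    time_start_requested: datetime.datetime
    time_start_completed: Optional[datetime.datetime] = None
    time_stopped: Optional[datetime.datetime] = None

class RunHistory:
    def __init__(self) -> None:
        self._runs: List[RunRecord] = []

    def start_new_run(self) -> RunRecord:
        record = RunRecord(time_start_requested=datetime.datetime.now())
        self._runs.append(record)
        return record

    def all_runs(self) -> List[RunRecord]:
        # Lists every past run executed by the runner, e.g. for GCP links.
        return list(self._runs)
```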
I've noticed we add the cleanup hook after setting up the
infrastructure. Thus, if infra setup fails, the cleanup won't run.
This fixes it, and adds extra checks to not call
`cls.test_client_runner` if it's not set.
Fail test if client or server pods restarted during test.
#### Testing
Tested locally; the test will fail with a message similar to:
```
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/google/home/xuanwn/workspace/xds/grpc/tools/run_tests/xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 501, in tearDown
))
AssertionError: 5 != 0 : Server pods unexpectedly restarted {sever_restarts} times during test.
----------------------------------------------------------------------
Ran 1 test in 886.867s
```
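A hypothetical sketch of such a check (the helper name is illustrative; in k8s the count comes from `containerStatuses[].restartCount` in the pod status):
```py
import unittest

class PodRestartCheckExample(unittest.TestCase):
    # Hypothetical sketch of the assertion used in tearDown.
    def assert_no_pod_restarts(self, restarts: int, what: str) -> None:
        self.assertEqual(
            0, restarts,
            msg=f"{what} pods unexpectedly restarted {restarts} times during test.")
```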
Better logging for `assertRpcStatusCodes`.
(got tired of looking up the status names)
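For reference, a small sketch of resolving `(code, NAME)` pairs from integer status codes (the driver has a similar helper, `helpers_grpc.status_from_int`; this sketch is only illustrative):
```py
import grpc

# Hypothetical sketch: map an integer status code to "(code, NAME)".
def status_pretty(code_int: int) -> str:
    status = next((s for s in grpc.StatusCode if s.value[0] == code_int), None)
    name = status.name if status else "UNKNOWN"
    return f"({code_int}, {name})"

# status_pretty(15) -> "(15, DATA_LOSS)"; status_pretty(0) -> "(0, OK)"
```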
#### Unexpected status found
Before:
```
AssertionError: AssertionError: Expected only status 15 but found status 0 for method UNARY_CALL:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
}
}
```
After:
```
AssertionError: Expected only status (15, DATA_LOSS), but found status (0, OK) for method UNARY_CALL:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
}
}
```
#### No traffic with expected status
Before:
```
AssertionError: 0 not greater than 0
```
After:
```
AssertionError: 0 not greater than 0 : Expected non-zero RPCs with status (15, DATA_LOSS) for method UNARY_CALL, got:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
result {
key: 15
value: 0
}
}
}
```
Before this change, `Found subchannel in state READY` and `Channel to
xds:///psm-grpc-server:61404 transitioned to state ` would dump the full
channel/subchannel; in some implementations that expose
ChannelData.trace (e.g. Go) this would add 300 extra lines of log.
Now we print a brief repr-like channel/subchannel info:
```
Found subchannel in state READY: <Subchannel subchannel_id=9 target=10.110.1.44:8080 state=READY>
Channel to xds:///psm-grpc-server:61404 transitioned to state READY: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=READY>
```
Also while waiting for the channel, we log channel_id now too:
```
Waiting to report a READY channel to xds:///psm-grpc-server:61404
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=READY>
```
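A rough sketch of the brief formatting shown above (in the real code the values come from the channelz `Channel`/`Subchannel` protos):
```py
# Hypothetical sketch of the repr-like one-liners.
def channel_brief(channel_id: int, target: str, state: str) -> str:
    return f"<Channel channel_id={channel_id} target={target} state={state}>"

def subchannel_brief(subchannel_id: int, target: str, state: str) -> str:
    return f"<Subchannel subchannel_id={subchannel_id} target={target} state={state}>"
```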
Similar to what we already do in other test suites:
- Try cleaning up resources three times.
- If unsuccessful, don't fail the test and just log the error. The
cleanup script should be the one to deal with this.
ref b/282081851
Resolve `TESTING_VERSION` to `dev-VERSION` when the job is initiated by
a user, and not by the CI. Override this behavior by setting
`FORCE_TESTING_VERSION`.
This solves the problem of manual job runs executed against a WIP
branch (e.g. a PR) overriding the tag of the CI-built image we use for
daily testing.
The `dev` and `dev-VERSION` "magic" values supported by the
`--testing_version` flag (a sketch of the resolution follows the list):
- `dev` and `dev-master` are treated as `master`: all
`config.version_gte` checks resolve to `True`.
- `dev-VERSION` is treated as `VERSION`: `dev-v1.55.x` is treated as
simply `v1.55.x`. We do this so that when manually running jobs for old
branches, the feature skip check still works and unsupported tests are
skipped.
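A minimal sketch of this resolution (the function name is illustrative, not the driver's actual API):
```py
# Hypothetical sketch of resolving the --testing_version "magic" values.
def resolve_testing_version(testing_version: str) -> str:
    if testing_version in ("dev", "dev-master"):
        return "master"  # all config.version_gte checks resolve to True
    if testing_version.startswith("dev-"):
        return testing_version[len("dev-"):]  # dev-v1.55.x -> v1.55.x
    return testing_version
```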
This change will take care of all langs/branches; no backports needed.
ref b/256845629
Previously, the error message didn't provide much context. Example:
```py
Traceback (most recent call last):
File "/tmpfs/tmp/tmp.BqlenMyXyk/grpc/tools/run_tests/xds_k8s_test_driver/tests/affinity_test.py", line 127, in test_affinity
self.assertLen(
AssertionError: [] has length of 0, expected 1.
```
ref b/279990584.
---------
Co-authored-by: Sergii Tkachenko <hi@sergii.org>
- Fix broken `bin/run_channelz.py` helper
- Create `bin/run_ping_pong.py` helper that runs the baseline (aka
"ping_pong") test against preconfigured infra
- Setup automatic port forwarding when running `bin/run_channelz.py` and
`bin/run_ping_pong.py`
- Create `bin/cleanup_cluster.sh` helper to wipe out xds resources based
on the namespaces present on the cluster
Note: this involves a small change to the non-helper code, but it's just
moving the part that makes the XdsTestServer/XdsTestClient instances for
a given pod.
While a proper fix is on the way, this reduces the number of duplicated
container log entries in the xds test server/client pod logs.
The issue is that we only wait between stream restarts when an exception
is caught, which isn't always the reason the stream gets broken. Another
reason is the main container being shut down by k8s. In this situation,
we essentially do
```py
while True:
try:
restart_stream()
read_all_logs_from_pod_start()
except Exception:
logger.warning('error')
wait_seconds(1)
```
This PR makes it
```py
while True:
try:
restart_stream()
read_all_logs_from_pod_start()
except Exception:
logger.warning('error')
finally:
wait_seconds(5)
```
`tearDownClass` is not executed when `setUpClass` fails. In the URL Map
test suite, this leads to a test client that failed to start not being
cleaned up.
This PR changes the URL Map test suite to register a custom
`addClassCleanup` callback instead of relying on `tearDownClass`.
Unlike `tearDownClass`, cleanup callbacks are executed even when
`setUpClass` fails.
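A minimal sketch of the approach, assuming a standard `unittest.TestCase` (class and method names are illustrative):
```py
import unittest

class UrlMapTestCaseExample(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Class cleanups run even if setUpClass raises later in this method,
        # unlike tearDownClass, which is skipped when setUpClass fails.
        cls.addClassCleanup(cls.cleanup_test_client)
        cls.test_client = cls.start_test_client()  # may raise

    @classmethod
    def cleanup_test_client(cls):
        client = getattr(cls, "test_client", None)
        if client is not None:
            client.cleanup()

    @classmethod
    def start_test_client(cls):
        ...  # illustrative placeholder
```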
ref b/276761453
Previously, we didn't configure the `failureThreshold`, so it used its
default value. The final `startupProbe` looked like this:
```json
{
"startupProbe": {
"failureThreshold": 3,
"periodSeconds": 3,
"successThreshold": 1,
"tcpSocket": {
"port": 8081
},
"timeoutSeconds": 1
}
```
Because of this, the total time before k8s killed the container was 3
probes (`failureThreshold`) × a 3-second wait between probes
(`periodSeconds`) = 9 seconds total (±3 seconds waiting for the probe
response).
This greatly affected the PSM Security test server, some implementations
of which waited for the ADS stream to be configured before starting to
listen on the maintenance port. This led to the server container being
killed ~7 times before a successful startup:
```
15:55:08.875586 "Killing container with a grace period"
15:53:38.875812 "Killing container with a grace period"
15:52:47.875752 "Killing container with a grace period"
15:52:38.874696 "Killing container with a grace period"
15:52:14.874491 "Killing container with a grace period"
15:52:05.875400 "Killing container with a grace period"
15:51:56.876138 "Killing container with a grace period"
```
These extra delays led to PSM security tests timing out.
ref b/277336725