Chiebot-Mirror/grpc - grpc - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Sergii Tkachenko	326a88a9a2	xDS interop: Handle the edge case when rand deployment_id is all nums (#30901 ) Fun edge case: when `rand_string()` happens to generate numbers only, yaml interprets the `deployment_id` label value as an integer, but k8s expects label values to be strings. K8s responds with a barely readable 400 Bad Request error: `ReadString: expects \" or n, but found 9, error found in #10 byte of ...\|ent_id`. Prepending the deployment name forces deployment_id into a string, and it is also a better description.	2 years ago
Sergii Tkachenko	48aa1376bf	xDS interop: increase pod wait timeout from 1 minute to 3 minutes (#30770 )	2 years ago
Sergii Tkachenko	1837dae106	xDS interop: custom before_sleep_log only logging primitive returns (#30757 ) When we use retryers with `log_level=logging.INFO`, tenacity logs the result value (or an exception) after each unsuccessful retry attempt. We often retry methods that return objects, resulting in unreadable log messages: ``` I0820 03:16:29.027635 140613877811008 before_sleep.py:45] Retrying framework.xds_k8s_testcase.IsolatedXdsKubernetesTestCase.cleanup in 10.0 seconds as it raised RetryError: RetryError[Attempts: 21, Value: {'api_version': 'v1', 'kind': 'Namespace', 'metadata': {'annotations': None, 'cluster_name': None, 'creation_timestamp': datetime.datetime(2022, 8, 20, 2, 55, 32, tzinfo=tzlocal()), 'deletion_grace_period_seconds': None, 'deletion_timestamp': datetime.datetime(2022, 8, 20, 3, 6, 27, tzinfo=tzlocal()), 'finalizers': None, 'generate_name': None, 'generation': None, 'labels': {'kubernetes.io/metadata.name': 'psm-interop-server-20220820-0253-yrmam', 'name': 'psm-interop-server-20220820-0253-yrmam', 'owner': 'xds-k8s-interop-test'}, 'managed_fields': [{'api_version': 'v1', 'fields_type': 'FieldsV1', 'fields_v1': {'f:metadata': {'f:labels': {'.': {}, 'f:kubernetes.io/metadata.name': {}, ... (82 more lines) ``` This PR introduces a custom `before_sleep` logger that only logs the value if it's a primitive: `int, str, bool`. Otherwise, it logs the type, for example: ``` k8s_base_runner.py:311] Waiting for pod psm-grpc-client-5d5648478f-7vsf7 to start retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>. retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>. ``` Note that this only changes the behavior of the unsuccessful retries, and doesn't affect the new feature that prints the formatted k8s status field if the final retry attempt failed.	2 years ago
Sergii Tkachenko	1107617282	xDS interop: collect pod logs (#30594 ) - Added support for pod log collection. To enable, set `--collect_app_logs` flag, and specify `--log_dir`. - Added support and helpers for operating on the `--log_dir` (natively provided by absl) - Added support for `--follow` to `bin/run_test_server.py` and `bin/run_test_client.py` to follow pod logs printed to stdout - Moved `PortForwarder` from k8s.py to its own file The collection itself will be enabled per-suite in https://github.com/grpc/grpc/pull/30735.	2 years ago
Sergii Tkachenko	0932b426f1	xDS interop: Fix default resource prefix (#30754 ) * xDS interop: Fix default resource prefix No longer just security tests. This is done to avoid confusion when debugging resources managed by the LB tests. * s/xds/psm	2 years ago
Sergii Tkachenko	283e9665f1	xDS interop: s/server_image_universal/server_image_canonical/g (#30740 )	2 years ago
Sergii Tkachenko	a3b535dd58	xDS interop: fix the order of rpc-behavior operations (#30739 ) To account for the API change made in https://github.com/grpc/grpc-java/pull/9171.	2 years ago
Sergii Tkachenko	666ea7cd21	xDS interop: fix an issue with secondary zone namespaces not cleaned (#30717 ) All alternative server runners except the failover test reuse the primary server runners' namespace. Failover test is using the secondary cluster, and manages its own namespace there. `reuse_namespace` disables namespace cleanup, and in this case it was set to `True` incorrectly.	2 years ago
Sergii Tkachenko	5abe970123	xDS interop: Improve retry logic and logging for the k8s retry operations (#30607 ) - Changes the order of waiting for pods to start: wait for the pods first, then for the deployment to transition to active. This should provide more useful information in the logs, showing exactly why the pod didn't start, instead of the generic "Replicas not available", ref b/200293121. This is also needed for https://github.com/grpc/grpc/pull/30594 - Adds support for a `check_result` callback in the retryer helpers - Completely replaces `retrying` with `tenacity`, ref b/200293121. `retrying` is no longer maintained. - Improves the readability of timeout errors: now they contain the timeout (or the attempt number) exceeded, and information on why the retry failed (exception/check function): Before: > `tenacity.RetryError: RetryError[<Future at 0x7f8ce156bc18 state=finished returned dict>]` After: > `framework.helpers.retryers.RetryError: Retry error calling framework.infrastructure.k8s.KubernetesNamespace.get_pod: timeout 0:01:00 exceeded. Check result callback returned False.` - Improves the readability of the k8s wait operation errors: now the log includes colorized and formatted status of the k8s object being watched, instead of dumping the full k8s object. For example, here's how an error caused by using incorrect TD bootstrap image:	2 years ago
Sergii Tkachenko	0a150b246b	Enable outlier detection k8s interop test for Java >= 1.49.x (#30668 ) * Enable outlier detection k8s interop test for Java. (#30641) * xDS interop: enable outlier detection Java tests in >= 1.49.x Co-authored-by: Terry Wilson <terrymwilson@gmail.com>	2 years ago
Sergii Tkachenko	f93c1ef1fe	Revert "Enable outlier detection k8s interop test for Java. (#30641 )" (#30665 ) This reverts commit `78a77bfaa2`.	2 years ago
Sergii Tkachenko	74bd2d8360	xDS interop: Replace pod_name with hostname (#30643 ) pod_name shouldn't be a part of the test app, it's purely a k8s idiom. Originally server_id was intended for this purpose, but it was missed when support for multiple server replicas was added. This replaces pod_name and server_id with hostname and improves replica-specific log messages, so it's clear which server the RPCs are issued to. In addition, now all RPC logs are annotated with the hostname:port, so the destination is clear. Before: ``` server_app.py:76] Setting health status to serving grpc.py:60] RPC XdsUpdateHealthService.SetServing(request=Empty({}), timeout=90, wait_for_ready=True) grpc.py:60] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True) server_app.py:78] Server reports status: SERVING ``` After: ``` server_app.py:89] [psm-grpc-server-69bcf749c5-bg4x5] Setting health status to NOT_SERVING grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC XdsUpdateHealthService.SetNotServing(request=Empty({}), timeout=90, wait_for_ready=True) grpc.py:72] [psm-grpc-server-69bcf749c5-bg4x5:52902] RPC Health.Check(request=HealthCheckRequest({}), timeout=90, wait_for_ready=True) server_app.py:92] [psm-grpc-server-69bcf749c5-bg4x5] Health status status: NOT_SERVING ``` Similarly, this adds hostname to the client app, mainly for logging.	2 years ago
Terry Wilson	78a77bfaa2	Enable outlier detection k8s interop test for Java. (#30641 )	2 years ago
Sergii Tkachenko	620c174e8c	xDS interop: Use ref server in py tests when set_not_serving needed (#30636 ) In python tests that require set_not_serving server RPC, override the python server with the reference server (Java) because the python server doesn't yet support set_not_serving RPC. Ref https://github.com/grpc/grpc/issues/30635.	2 years ago
Sergii Tkachenko	7712f93805	xDS interop: Generate deployment_id match label (#30596 ) This fixes an issue where KubernetesNamespace.list_deployment_pods(), as well as the deployment itself, would select incorrect pods when multiple deployments share the same namespace.	2 years ago
Sergii Tkachenko	3817db13b6	xDS interop: Move k8s-specific logic out of the test app (#30591 ) Separates xDS Test Client/Server (represent an interface to corresponding workload running remotely) from their runners (kubernetes-specific logic to provision the workloads with prerequisites). This is a refactoring, should not change the behavior.	2 years ago
Sergii Tkachenko	5d0e744da6	xDS interop: override retriable cleanup instead of tearDown (#30540 ) Some tests override unittest's `tearDown()`, which is not wrong, but less resilient than overriding custom `cleanup()` that is being retried in framework's `tearDown()`.	2 years ago
Sergii Tkachenko	1ed5b24f35	xDS interop: add support for the reference xds test server (#30519 ) - xDS interop: add support for the reference xds test server - Set default xDS test server reference to Java `v1.48.1` - Override xDS test server with the reference in Outlier Detection	2 years ago
Sergii Tkachenko	351bfad1f7	xDS interop: log the subTest start and beginning (#30517 ) To improve debugging of the tests with steps that look similar, f.e. failover. Makes the end of one subtest, and the beginning of the next one much clearer. Note: URL map test suite does not use subtests, so I didn't add the logging there.	2 years ago
Yash Tibrewal	ebfb028a29	xDS Interop: C++ xDS Authz supported after 1.47 (#30505 )	2 years ago
Sergii Tkachenko	3ab8b2ee62	xDS interop: update td bootstrap from v0.12.0-rc1 to v0.14.0 (#30455 ) Corresponding changes to the k8s manifests: - Remove `include-v3-features-experimental`: it's enabled by default as of v0.11.0 - Remove `include-psm-security-experimental`: enabled by default as of v0.12.0 - Rename `node-metadata-experimental` to node-metadata: stabilized as of v0.13.0 https://github.com/GoogleCloudPlatform/traffic-director-grpc-bootstrap/releases/tag/v0.12.0-rc1 https://github.com/GoogleCloudPlatform/traffic-director-grpc-bootstrap/releases/tag/v0.14.0	2 years ago
Sergii Tkachenko	5bc413b736	xDS interop: set default socket timeout to 60. (#30451 ) The `kubernetes` library does not provide a way to configure the default socket timeout for the `urllib3` it uses under the hood, and the `urllib3` default socket timeout is infinity. This PR sets the default socket timeout using python's `socket.setdefaulttimeout()` to 60 seconds. This affects `urllib3` directly, and therefore `kubernetes`. The change is also picked up by the `google-api-python-client`, which does not use `urllib3` (it uses `httplib2`), but [respects](https://googleapis.github.io/google-api-python-client/docs/epy/googleapiclient.http-module.html#build_http) `socket.setdefaulttimeout()`.	2 years ago
Sergii Tkachenko	9077532620	xds interop: Log operation id (#30407 ) Add consistent operation id logs for GCP long-running operations - both old-style (compute) and the new APIs. At the moment it's a bit more verbose than I'd want, f.e. it doubles the number of log messages during the teardown. We should probably only log failed ops. But to do this reliably, we should probably revisit the issue with improving tenacity retry error fail reports.	2 years ago
Michael Lumish	f3c57aab0a	Add outlier detection xDS interop test using k8s interop framework (#30250 ) * Add xDS interop test for outlier detection This implements the test described in #29623, and plumbing for setting the outlierDetection field in the backend service config. The changes in this PR are very similar to #29688. * Fix use of configure method * Correct copy/paste error * Fix metadata configuration syntax * Increase QPS, use just one method * Format code * Apply suggestions from code review Co-authored-by: Sergii Tkachenko <hi@sergii.org> * Address review comments * Only Java implements the required server features * Automated change: Fix sanity tests * Address review comments * Use double quotes for docstring Co-authored-by: Sergii Tkachenko <hi@sergii.org> Co-authored-by: Sergii Tkachenko <hi@sergii.org> Co-authored-by: murgatroid99 <murgatroid99@users.noreply.github.com>	2 years ago
Sergii Tkachenko	a9c2f80a53	xds interop: resume failover tests on all branches (#30344 ) Resume the failover test on all branches, now that the following PRs were backported to all branches: - https://github.com/grpc/grpc-go/pull/5508 - https://github.com/grpc/grpc-java/pull/9389 Continues #30308 ref b/238226704	2 years ago
Sergii Tkachenko	f0edd0e44a	xds interop: resume failover tests (#30308 ) Resume the failover test. For now, just on master. Will be resumed on other branches, when the fix is backported. At the moment, the master is fixed in java and go. ref b/238226704	2 years ago
yifeizhuang	672be80d84	disable cpp failover test (#30293 )	2 years ago
Michael Lumish	a3c9b90705	xDS k8s tests: add Node skips for unsupported tests (#30279 )	2 years ago
yifeizhuang	d9708884e9	xds interop test: temporarily disable failover test (#30243 ) Temporarily disable failover test for Java and Go	2 years ago
Terry Wilson	a4f0ce13a7	Include an invalid config to custom LB test (#30236 ) This will verify that the configuration for a missing LB gets skipped in favor of the valid one.	2 years ago
Terry Wilson	d64e200db4	Add is_supported function to the custom lb test (#30222 )	2 years ago
Sergii Tkachenko	8d4c9a6f99	xds-k8s: Fix assertRpcsEventuallyGoToGivenServers not raising (#30224 ) All tests that use `assertRpcsEventuallyGoToGivenServers` method were reporting successes when the assertion failed: - FailoverTest - ChangeBackendServiceTest - RemoveNegTest	2 years ago
Sergii Tkachenko	720bed25bc	xds-k8s: Remove skips.version_lt(), and use only skips.version_gte() (#30124 ) A minor refactoring.	2 years ago
Easwar Swaminathan	e92469fe5a	xds-k8s: Bootstrap generator interop tests (#29954 ) Added a couple of tests which run the baseline_test with all released bootstrap generator versions on client and server. These tests will be run on a continuous integration environment with gRPC servers and clients built using the latest released version of gRPC in one selected language.	2 years ago
Michael Lumish	4f135e0e9e	Add supported Node version ranges in xDS k8s url_map tests (#29960 ) * Add supported Node version ranges in xDS k8s url_map tests This adds is_supported implementations for most of the url_map tests that didn't already have them. The exception is metadata_filter_test because it doesn't use any specific client features. * Fix formatting * Improve timeout test check order	2 years ago
Sergii Tkachenko	c27730218c	xds-k8s: handle missing edge case in TestConfig version comparison (#30030 ) Fixes incorrect `master` version handling in subsetting_test. Ref b/235825277	2 years ago
Sergii Tkachenko	ab1a23e64e	xds-k8s: add type hints to XdsKubernetesBaseTestCase (#30025 ) To make PyCharm happy.	2 years ago
Lidi Zheng	db684ad0dd	[xDS interop] Fix the affinity test support range (#29972 )	3 years ago
Lidi Zheng	2cae3827e6	[xDS interop] Disable api_listener_test for older Python versions (#29945 )	3 years ago
Sergii Tkachenko	165dda75f9	xds-k8s: Fix the issue with Java PSM security tests skipped (#29925 ) 1. Fixes the issue with Java PSM security tests accidentally skipped because Java was missing from the list of languages, ref https://github.com/grpc/grpc/pull/28978 2. Invert the logic of `is_supported` methods, making them normally open 3. Make languages an `enum.Flag` to avoid accidental typos when listing the languages 4. Rename `XdsKubernetesTestCase.isSupported` to `XdsKubernetesTestCase.is_supported` to be consistent with `XdsUrlMapTestCase.is_supported` 5. Add extra logging	3 years ago
Sergii Tkachenko	3d50a60296	xds-k8s: Split base test class to allow for non-isolated tests (#29921 ) Split `XdsKubernetesTestCase` into: - `XdsKubernetesTestCase` top-level base class containing flag parsing logic and common `assert*` methods - `XdsKubernetesIsolatedTestCase` extending `XdsKubernetesTestCase`, that is specific to tests that want to create infra resources before each test, and destroy them after. Now tests that don't need to create/destroy all resources on each run can extend `XdsKubernetesTestCase` without having to override all of setUp or implement other unnecessary methods.	3 years ago
Sergii Tkachenko	53ce85cc3b	xds-k8s: Output logs timezone in the beginning of the tests (#28865 )	3 years ago
Lidi Zheng	dd17960bc8	[xDS interop] Limit subsetting test to master branch (#29904 )	3 years ago
Sergii Tkachenko	32f1766cf5	xds-k8s README.md: improve basic setup docs (#29804 ) * reorder sections * recommend local-dev.cfg instead of grpc-testing.cfg to avoid confusion * clarify security tests	3 years ago
Sergii Tkachenko	8af28b83db	xds-k8s: Add whitespace after Logs Explorer link (#29863 ) To fix link parsers not recognizing the termination of the url, and including the next line.	3 years ago
Terry Wilson	887a605940	Allow interop tests to configure locality_lb_policies (#29688 ) * Allow tests to configure locality_lb_policies * Custom LB test	3 years ago
Sergii Tkachenko	d5c8bbce51	xds-k8s: Do not recommend enabling mesh certs by default (#29743 ) This should be covered separately per this note: > For more details, and for the setup for security tests, see ["Setting up Traffic Director service security with proxyless gRPC"](https://cloud.google.com/traffic-director/docs/security-proxyless-setup) user guide.	3 years ago
Sergii Tkachenko	972374347b	[PSM interop] Double the operation timeout - misc (#29627 ) https://github.com/grpc/grpc/pull/29004 doubled resource timeouts for backends, but not networksecurity/networkservices resources.	3 years ago
Sergii Tkachenko	fa5598759c	xds-k8s readme minor fixes and improvements (#29578 )	3 years ago
Sergii Tkachenko	f36e84f093	xds-k8s: Fix incorrect type hint in the baseline test (#29577 )	3 years ago
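
Commit 1837dae106 describes logging a retried call's result only when it is a primitive (`int`, `str`, `bool`) and logging its type otherwise. The rule can be sketched in plain Python as below. This is an illustration only, not the framework's code: `describe_retry_result` and `FakePod` are hypothetical names, and the tenacity `before_sleep` hook that would call such a helper is left out.

```python
from typing import Any

# The primitive types named in the commit message.
_PRIMITIVES = (int, str, bool)


def describe_retry_result(result: Any) -> str:
    """Return a short log fragment: the value for primitives, the type otherwise.

    Hypothetical helper; the real logger in the framework is not shown here.
    """
    if isinstance(result, _PRIMITIVES):
        return f"returned {result!r}"
    return f"returned type {type(result)}"


class FakePod:
    """Stand-in for a large API object, such as a k8s pod model."""


if __name__ == "__main__":
    print(describe_retry_result(False))
    print(describe_retry_result("pending"))
    print(describe_retry_result(FakePod()))
```

Logging the type keeps each retry message on one line, which is the readability gain the commit message describes.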


168 Commits (04ddf3d0b73ab6bdf4f5e1566328ec54d654859b)
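
Commit 5bc413b736 sets a process-wide default socket timeout with `socket.setdefaulttimeout()`. A minimal sketch of that mechanism using only the standard library is shown below. The constant name is an assumption; the 60-second value comes from the commit message. The effect on `urllib3`, `kubernetes`, and `httplib2` described in the commit is not exercised here.

```python
import socket

# 60 seconds is the value named in the commit message; the constant name is
# illustrative only.
DEFAULT_SOCKET_TIMEOUT_SEC = 60

# Applies to every socket created afterwards in this process that does not
# set its own timeout explicitly.
socket.setdefaulttimeout(DEFAULT_SOCKET_TIMEOUT_SEC)

# A newly created socket inherits the default timeout.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
inherited_timeout = sock.gettimeout()
sock.close()

if __name__ == "__main__":
    print(socket.getdefaulttimeout(), inherited_timeout)
```

Because the setting is global, it has to run before the client libraries open their connections. Sockets that already exist keep whatever timeout they had.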