Chiebot-Mirror/grpc

Commit Graph

Author	SHA1	Message	Date
Alisha Nanda	b94fb2b894	Fix PickFirstTest.PendingUpdateAndSelectedSubchannelFails flake in client_lb_end2end_test (#30741 ) * ConnectionAttemptInjector: fix tsan failures * Add hold to test * Add hold to test * Address comments * Address comments * Address comments * Fix typo Co-authored-by: Mark D. Roth <roth@google.com>	2 years ago
Sergii Tkachenko	1837dae106	xDS interop: custom before_sleep_log only logging primitive returns (#30757 ) When we use retryers with `log_level=logging.INFO`, tenacity logs the result value (or an exception) after each unsuccessful retry attempt. We often retry methods that return objects, resulting in unreadable log messages: ``` I0820 03:16:29.027635 140613877811008 before_sleep.py:45] Retrying framework.xds_k8s_testcase.IsolatedXdsKubernetesTestCase.cleanup in 10.0 seconds as it raised RetryError: RetryError[Attempts: 21, Value: {'api_version': 'v1', 'kind': 'Namespace', 'metadata': {'annotations': None, 'cluster_name': None, 'creation_timestamp': datetime.datetime(2022, 8, 20, 2, 55, 32, tzinfo=tzlocal()), 'deletion_grace_period_seconds': None, 'deletion_timestamp': datetime.datetime(2022, 8, 20, 3, 6, 27, tzinfo=tzlocal()), 'finalizers': None, 'generate_name': None, 'generation': None, 'labels': {'kubernetes.io/metadata.name': 'psm-interop-server-20220820-0253-yrmam', 'name': 'psm-interop-server-20220820-0253-yrmam', 'owner': 'xds-k8s-interop-test'}, 'managed_fields': [{'api_version': 'v1', 'fields_type': 'FieldsV1', 'fields_v1': {'f:metadata': {'f:labels': {'.': {}, 'f:kubernetes.io/metadata.name': {}, ... (82 more lines) ``` This PR introduces custom `before_sleep` logger, that only logs the value if it's a primitive: `int, str, bool`. Otherwise, it logs the type, example: ``` k8s_base_runner.py:311] Waiting for pod psm-grpc-client-5d5648478f-7vsf7 to start retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>. retryers.py:192] Retrying framework.infrastructure.k8s.KubernetesNamespace.get_pod in 1.0 seconds as it returned type <class 'kubernetes.client.models.v1_pod.V1Pod'>. 
``` Note that this only changes the behavior of unsuccessful retries and doesn't affect the new feature that prints the formatted k8s status field if the final retry attempt fails.	2 years ago
Sergii Tkachenko	1e42fbeb52	xDS interop: enable pod log collection in the buildscripts (#30735 ) - Enables pod log collection in all PSM interop jobs implemented in https://github.com/grpc/grpc/pull/30594. - Associates each test suite run with its own log file, so it's displayed on the "Target Log" tab	2 years ago
Sergii Tkachenko	1107617282	xDS interop: collect pod logs (#30594 ) - Added support for pod log collection. To enable, set `--collect_app_logs` flag, and specify `--log_dir`. - Added support and helpers for operating on the `--log_dir` (natively provided by absl) - Added support for `--follow` to `bin/run_test_server.py` and `bin/run_test_client.py` to follow pod logs printed to stdout - Moved `PortForwarder` from k8s.py to its own file The collection itself will be enabled per-suite in https://github.com/grpc/grpc/pull/30735.	2 years ago
Michael Lumish	9519fdc956	Enable prod outlier detection interop tests for C++ and Python (#30624 )	2 years ago
Craig Tiller	5b6dac02ac	[stats] Cleanup & re-enable stats system (#30610 ) * [stats] Cleanup stats system * clear out optionality * fix * might as well... * Automated change: Fix sanity tests * clean out more unused stuff * clean out more unused stuff * Automated change: Fix sanity tests Co-authored-by: ctiller <ctiller@users.noreply.github.com>	2 years ago
Craig Tiller	b8fde2ab47	Revert "GCP Observability: Add plugin registry API (#30571 )" (#30765 ) This reverts commit `486710317f`.	2 years ago
Yash Tibrewal	486710317f	GCP Observability: Add plugin registry API (#30571 ) * GCP Observability: Add plugin registry API * Restrict visibility for now * Move GcpObservability to its own thing * Reviewer comments	2 years ago
Mark D. Roth	5f7096614a	xds_e2e_test_lib: increase default timeouts in test framework (#30756 ) * e2e tests: add test scaling factor to durations in channel args * apply test scaling factor when encoding durations in xDS protos * apply scaling factor in fixed timeout in WaitForNack() * fix overflow * clang-format * adjust timeouts in fault injection tests * add missing slowdown factor * clang-format * xds_e2e_test_lib: increase default timeouts in test framework	2 years ago
Yash Tibrewal	b6a53145e9	HTTP2: Keepalive time logging (#30753 ) * HTTP2: Keepalive time logging * Adding the peer string too	2 years ago
Mark D. Roth	121a08f6a9	end2end tests: apply test slowdown factor in various places where it was missed (#30749 ) * e2e tests: add test scaling factor to durations in channel args * apply test scaling factor when encoding durations in xDS protos * apply scaling factor in fixed timeout in WaitForNack() * fix overflow * clang-format * adjust timeouts in fault injection tests * add missing slowdown factor * clang-format * add tests for duration multiplication	2 years ago
apolcyn	8b941bb648	Drop support for ruby 2.5 (#30699 ) * Drop ruby 2.5 support	2 years ago
Sergii Tkachenko	0932b426f1	xDS interop: Fix default resource prefix (#30754 ) * xDS interop: Fix default resource prefix No longer just security tests. This is done to avoid confusion when debugging resources managed by the LB tests. * s/xds/psm	2 years ago
Mark D. Roth	143c852d2f	end2end tests: fix test service impl to apply test slowdown factor (#30750 ) * end2end tests: fix test service impl to apply test slowdown factor * fix build	2 years ago
Yousuk Seung	807e93f250	[fixit] Deflake xds_outlier_detection_end2end_test (#30690 ) * timing deflake * Removed empty lines, unused setter. * Shorten sleep time, removed 3 sleeps. * hardcode sleep times everywhere * add back test factor * sanity check fix	2 years ago
Craig Tiller	a88181a778	[fixit] Fix internal fork_test flakiness (#30748 )	2 years ago
Mark D. Roth	aed25cb34a	priority LB: don't set child picker to null when failover timer fires (#30737 )	2 years ago
Yash Tibrewal	2805b523d9	XdsRingHash: Tune timeouts (#30742 ) * XdsRingHash: Tune timeouts * Add WaitForBackendOptions timeout * More tuning * Fix	2 years ago
Sergii Tkachenko	283e9665f1	xDS interop: s/server_image_universal/server_image_canonical/g (#30740 )	2 years ago
Vignesh Babu	09558e9052	Adjust rpc timeouts in xds tests to reduce Deadline exceeded errors in msan (#30732 )	2 years ago
Sergii Tkachenko	a3b535dd58	xDS interop: fix the order of rpc-behavior operations (#30739 ) To account for the API change made in https://github.com/grpc/grpc-java/pull/9171.	2 years ago
Mark D. Roth	03b6b01043	ConnectionAttemptInjector: fix tsan failures (#30730 )	2 years ago
AJ Heller	24bc7c455f	Match the greeter async example to the tutorial (#30731 ) The "Prepare" bit is unnecessary for the basic async example. It's used more meaningfully in the v2 async greeter implementations.	2 years ago
Ming-Chuan	e636213f88	Binder transport: Log endpoint binder object before passing it to client (#29555 ) This will help us identify issues related to connection establishment.	2 years ago
Mark D. Roth	4a27b432b6	xds_cluster_e2e_test: change tests to provide better failure messages (#30727 )	2 years ago
Mark D. Roth	dc1cb1fb59	grpclb_e2e_test: increase timeout in InitiallyEmptyServerlist test (#30726 )	2 years ago
Mark D. Roth	ee900c0e39	ring_hash: fix subchannel list to not shutdown until picker is destroyed (#30714 ) * ring_hash: fix subchannel list to not shutdown until picker is destroyed * hop into WorkSerializer before unreffing subchannels * use a weak ref for subchannel connectivity state watches * Automated change: Fix sanity tests * fix memory leak * clang-format * fix circular reference problem by moving ring into subchannel list * Automated change: Fix sanity tests Co-authored-by: markdroth <markdroth@users.noreply.github.com>	2 years ago
Craig Tiller	ca7c17c8d4	[fixit] Make the sampling fuzzer always pass for now (#30718 ) * [fixit] Make the sampling fuzzer always pass for now * Update sample_fuzzers.sh	2 years ago
Sergii Tkachenko	666ea7cd21	xDS interop: fix an issue with secondary zone namespaces not cleaned (#30717 ) All alternative server runners except the failover test reuse the primary server runners' namespace. Failover test is using the secondary cluster, and manages its own namespace there. `reuse_namespace` disables namespace cleanup, and in this case it was set to `True` incorrectly.	2 years ago
Craig Tiller	93fb6add2a	[fixit] Reduce the size of this benchmark under expensive sanitizers (#30715 )	2 years ago
AJ Heller	f7d8ee068a	[fixit] Extend timeout for SameBackendListedMultipleTimes/V3 test (#30716 ) Previously this failed 1/1000 times with a 1s timeout, giving a `Deadline Exceeded` error. I was able to reproduce the failure in 22/1000 times with a 500ms timeout. Changing it to a 2s timeout in this PR, the failure did not reproduce in 5000 runs.	2 years ago
Jan Tattermusch	892320ad0e	warn about kokoro worker without mounted /tmpfs. (#30711 ) Also allow debugging of available disk space.	2 years ago
Craig Tiller	a6d67ab6db	[fixit] Disable ub/msan on all qps, tsan on some qps tests (#30713 )	2 years ago
Craig Tiller	70d9ccf576	[chttp2] Handle no authority in channel args without crashing (#30706 )	2 years ago
Cheng-Yu Chung	7c86c34e63	[fixit] Solve the flakiness for test case `End2endTest.ClientCancelsBidi` (#30664 ) * First try to solve the flakiness of End2endTest.ClientCancelsBidi * Update using `absl::Notification` * Update	2 years ago
Craig Tiller	f47e339d81	[zlib] Remove dependency on zlib version (#30704 )	2 years ago
Craig Tiller	e2186fd9e5	[fixit] Increase timeout, fix atomicity (#30671 ) * Increase timeout * fix atomicity * windows fix * Update tls_utils.cc	2 years ago
Sergii Tkachenko	5abe970123	xDS interop: Improve retry logic and logging for the k8s retry operations (#30607 ) - Changes the order of waiting for pods to start: wait for the pods first, then for the deployment to transition to active. This should provide more useful information in the logs, showing exactly why the pod didn't start, instead of the generic "Replicas not available", ref b/200293121. This is also needed for https://github.com/grpc/grpc/pull/30594 - Adds support for the `check_result` callback in the retryer helpers - Completely replaces `retrying` with `tenacity`, ref b/200293121. `retrying` is no longer maintained. - Improves the readability of timeout errors: now they contain the timeout (or the attempt number) exceeded, and information on why the timeout failed (exception/check function): Before: > `tenacity.RetryError: RetryError[<Future at 0x7f8ce156bc18 state=finished returned dict>]` After: > `framework.helpers.retryers.RetryError: Retry error calling framework.infrastructure.k8s.KubernetesNamespace.get_pod: timeout 0:01:00 exceeded. Check result callback returned False.` - Improves the readability of the k8s wait operation errors: now the log includes colorized and formatted status of the k8s object being watched, instead of dumping the full k8s object. For example, here's how an error caused by using incorrect TD bootstrap image:	2 years ago
AJ Heller	4c2aa29b13	[fixit] Add source location to CqVerifier output (#30702 ) * [fixit] Add source location to CqVerifier output Supersedes #30701 * Automated change: Fix sanity tests Co-authored-by: drfloob <drfloob@users.noreply.github.com>	2 years ago
Craig Tiller	a9d4bc4cf9	[fixit] Scale down large tests (#30673 ) We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required.	2 years ago
Craig Tiller	f133d81714	[fixit] Scale down large tests (#30676 ) We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required.	2 years ago
Craig Tiller	966b58edb7	[fixit] Scale down large tests (#30679 ) We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required.	2 years ago
Craig Tiller	5a5ff826cf	[fixit] Scale down large tests (#30685 ) We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required.	2 years ago
Yash Tibrewal	3e21838aff	[fixit] Tune max_connection_age grace period (#30703 )	2 years ago
Sergii Tkachenko	0a150b246b	Enable outlier detection k8s interop test for Java >= 1.49.x (#30668 ) * Enable outlier detection k8s interop test for Java. (#30641) * xDS interop: enable outlier detection Java tests in >= 1.49.x Co-authored-by: Terry Wilson <terrymwilson@gmail.com>	2 years ago
Mark D. Roth	72e76f6a86	client_lb_e2e_test: fix flake in PickFirstTest.CheckStateBeforeStartWatch (#30698 )	2 years ago
Craig Tiller	cca2dcd9e9	[fixit] Scale down large tests (#30682 ) We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required.	2 years ago
Craig Tiller	a1dc97e9df	[fixit] Annotate stats_test cpu requirements (#30693 ) * fix * Update BUILD	2 years ago
Craig Tiller	406d518742	Revert "4 cpus is the max (#30696 )" (#30697 ) This reverts commit `75c780039b`.	2 years ago
Craig Tiller	708259dc90	[fixit] Scale down large tests (#30683 ) * [fixit] Scale down large tests We have many tests that create 100 threads or more, and mounting evidence that this is harmful to our CI environment. When the original code for many of these tests was written we ran our tests under run_tests, which had explicit handling for tracking the number of threads each test needed and making sure that we weren't over subscribing the test runner. Bazel has no such facility (and the facility in run_tests has since been removed) and so we need to adjust. This PR adjusts down a single test and is part of a series so that we can review and roll back easily if required. * mark up cpu usage * Automated change: Fix sanity tests Co-authored-by: ctiller <ctiller@users.noreply.github.com>	2 years ago
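
Two of the xDS interop commits listed above (#30757 and #30607) describe retry helpers that accept a `check_result` callback and log only primitive return values between attempts, falling back to the type for anything else. The following is a minimal sketch of that idea in plain Python. It does not use `tenacity`, and every name in it (`describe_result`, `retry_call`, this `RetryError`) is hypothetical and chosen for illustration; it is not the framework's actual code.

```python
import logging
import time
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)

# Assumption: the "primitive" set follows the commit message (int, str, bool).
_PRIMITIVES = (int, str, bool)


def describe_result(value: Any) -> str:
    """Describe a return value: the value itself for primitives, else its type."""
    if isinstance(value, _PRIMITIVES):
        return f"returned {value!r}"
    return f"returned type {type(value)}"


class RetryError(Exception):
    """Hypothetical error raised when no attempt produced an accepted result."""


def retry_call(
    fn: Callable[[], Any],
    *,
    attempts: int,
    wait_sec: float = 0.0,
    check_result: Optional[Callable[[Any], bool]] = None,
) -> Any:
    """Call fn up to `attempts` times until check_result accepts its result."""
    name = getattr(fn, "__qualname__", repr(fn))
    for attempt in range(1, attempts + 1):
        result = fn()
        if check_result is None or check_result(result):
            return result
        if attempt < attempts:
            # Log a short description instead of dumping the whole object.
            logger.info(
                "Retrying %s in %s seconds as it %s.",
                name, wait_sec, describe_result(result),
            )
            time.sleep(wait_sec)
    raise RetryError(
        f"Retry error calling {name}: {attempts} attempts exceeded. "
        "Check result callback returned False."
    )
```

With this sketch, a call that returns a large dictionary is logged between attempts as `returned type <class 'dict'>` rather than as the full object, while integer, string, and boolean results are still logged by value.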


51844 Commits (34407bf551bf3507e6aefd40f2f553bc3443caa2)
