Pretty sure that the `LoadBalancedCall` instance is getting dropped before return on *some* path through `PickSubchannel`, although it's unclear to me which and it's hard to tell with this implementation.
Ensure that such an event cannot cause a crash by holding a ref to the object we need and calling through that.
This makes each pick marginally slower for now, but once work serializer dispatch lands everywhere the additional ref will disappear.
Closes #37663
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37663 from ctiller:flake-fightas-3 d12b2e0540
PiperOrigin-RevId: 673088001
As title.
Bump core version since there were API changes (to logging APIs) since the last release.
Closes #37661
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37661 from apolcyn:bump_core_version_202409092234 190b88342b
PiperOrigin-RevId: 672984841
This change adds an experiment that moves time caching out of `ExecCtx` (which is the wrong place for this mechanism) and into the party update path. The expectation is that a single poll of a call is the granularity at which time caching is a useful optimization, and scoping it there avoids the unbounded hold times associated with the current mechanism.
This requires fixing up a few tests that grew to depend on time caching (would appreciate close eyes on the credentials test, as it's unclear to me why this is required or what the effect is).
This should also fix b/232544809.
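To illustrate the idea of bounding the cache lifetime to a single poll, here is a minimal Python sketch. It is not gRPC's implementation: `ScopedTimeCache` and `poll_once` are hypothetical names, and the clock is injectable only so that the behavior can be checked deterministically.

```python
import time


class ScopedTimeCache:
    """Caches the clock reading for the duration of one `with` block.

    Hypothetical stand-in for a per-poll time cache; not gRPC's API.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cached = None
        self._active = False

    def __enter__(self):
        self._active = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the scope drops the cached value, so the hold time is
        # bounded by the length of a single poll.
        self._active = False
        self._cached = None
        return False

    def now(self):
        if not self._active:
            return self._clock()  # outside a poll: always read the clock
        if self._cached is None:
            self._cached = self._clock()
        return self._cached


def poll_once(cache, work_items):
    """Runs every work item inside one cache scope and collects results."""
    with cache:
        return [item(cache) for item in work_items]
```

Every `now()` call made during one `poll_once` sees the same timestamp, while the next poll reads the clock afresh.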
Closes #37637
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37637 from ctiller:closer-to-the-sun 8bbde2d0bd
PiperOrigin-RevId: 672574762
This test is flaky. Example failure: https://fusion2.corp.google.com/invocations/31bcd70f-650d-40be-8ec4-6fafc3cd3724, with error message:
```
Traceback (most recent call last):
  File "/usr/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/lib/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/usr/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/var/local/git/grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
    return loop.run_until_complete(f(*args, **kwargs))
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/var/local/git/grpc/src/python/grpcio_tests/tests_aio/unit/call_test.py", line 820, in test_cancel_after_done_writing
    self.assertFalse(call.done())
  File "/usr/lib/python3.9/unittest/case.py", line 676, in assertFalse
    raise self.failureException(msg)
AssertionError: True is not false
```
The test expects the call to NOT be finished immediately after `await call.done_writing()`, which is not a valid assumption.
* `call.done_writing()` only notifies the server that the client is done sending messages, while `call.done()` checks whether the RPC itself has finished.
Thus, remove the assert on `call.done()`.
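The distinction can be shown with a toy model. `FakeCall` below is a hypothetical stand-in written for this note, not the `grpc.aio` call object; it only models the fact that the server is free to finish the RPC as soon as it sees the half-close.

```python
import asyncio


class FakeCall:
    """Hypothetical stand-in for a streaming call; not the grpc.aio API."""

    def __init__(self, server_finishes_immediately):
        self._server_finishes_immediately = server_finishes_immediately
        self._finished = False

    async def done_writing(self):
        # Half-close: tells the server that no more requests are coming.
        if self._server_finishes_immediately:
            self._finished = True  # the server may already have replied
        await asyncio.sleep(0)

    def done(self):
        # Reports whether the whole RPC has terminated.
        return self._finished


async def done_after_half_close(server_finishes_immediately):
    call = FakeCall(server_finishes_immediately)
    await call.done_writing()
    return call.done()
```

Both outcomes are legal after `done_writing()`, so an assertion on `done()` at that point races with the server.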
Closes #37651
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37651 from XuanWang-Amos:fix_aio_cancel_after_done_writing 42add84236
PiperOrigin-RevId: 671834305
This test has been flaky for a while now. Most of the time it fails with a timeout error; one [example](https://btx.cloud.google.com/invocations/c31650ae-afdd-4e50-bfc4-b9a43588d3ba/log):
```
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 676, in run
    self._callTestMethod(testMethod)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
    method()
  File "/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_python_macos_opt_native/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
    return loop.run_until_complete(f(*args, **kwargs))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_python_macos_opt_native/src/python/grpcio_tests/tests_aio/unit/connectivity_test.py", line 54, in test_unavailable_backend
    await asyncio.wait_for(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/tasks.py", line 501, in wait_for
    raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
```
Suspect the root cause is the same as the one mentioned here: https://github.com/grpc/grpc/pull/26409
Instead of disabling this test on macOS, changing the timeout from `4s` to `8s` seems reasonable too.
Closes #37645
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37645 from XuanWang-Amos:fix_aio_connectivity_test 10fdf90635
PiperOrigin-RevId: 671523025
This test is flaky in CI and is being replaced by the new implementation with event engine, so disable it.
Will clean these tests up after everything has switched to event engine.
Closes #37636
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37636 from HannahShiSFB:disable-iomgr-cfstream-test b430ee3d7c
PiperOrigin-RevId: 671399515
Fix some flakiness caused by fixing jitter in the backoff library in #37595.
In the xDS retry tests, the additional jitter made it such that an extra attempt snuck in before the call deadline, so I adjusted the knobs to ensure that exactly the expected number of attempts fit into the tests.
In the PF test, I rewrote the test to use a connection injector, so that it can more accurately tell the time between the connection attempts, without seeing skew due to the server startup time.
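To see how jitter can let an extra attempt sneak in before the deadline, the following Python sketch bounds the number of attempts for a given set of knobs. It is a simplified model under stated assumptions (every attempt fails instantly, jitter scales each delay by a factor in `[1 - jitter, 1 + jitter]`); the function names are hypothetical and the numbers are not the ones used in the tests.

```python
def attempts_before_deadline(deadline, initial, multiplier, max_backoff,
                             jitter_factor):
    """Counts attempts that start before `deadline`.

    Assumes each attempt fails instantly and every backoff delay is scaled
    by the same `jitter_factor`, which gives a best or worst case bound.
    """
    if initial <= 0 or jitter_factor <= 0 or multiplier < 1:
        raise ValueError("invalid backoff parameters")
    elapsed = 0.0
    delay = initial
    attempts = 1  # the first attempt starts at t = 0
    while True:
        elapsed += delay * jitter_factor
        if elapsed >= deadline:
            return attempts
        attempts += 1
        delay = min(delay * multiplier, max_backoff)


def attempt_range(deadline, initial, multiplier, max_backoff, jitter):
    """Returns (fewest, most) attempts possible under the given jitter."""
    fewest = attempts_before_deadline(
        deadline, initial, multiplier, max_backoff, 1 + jitter)
    most = attempts_before_deadline(
        deadline, initial, multiplier, max_backoff, 1 - jitter)
    return fewest, most
```

When `fewest` and `most` differ, a test asserting an exact attempt count can flake; tuning the deadline or backoff knobs until they agree removes that source of nondeterminism.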
Closes #37629
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37629 from markdroth:xds_retry_backoff_flake_fix f35ac902a0
PiperOrigin-RevId: 671120341
Closes #37632
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37632 from yijiem:alts-concurrent-connectivity-test-flake 68df4b1327
PiperOrigin-RevId: 671117730
Looks like there are some odd interactions, but call-v3 doesn't (and will never) handle wakeup sets, so disable for now until iomgr is removed.
Closes #37630
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37630 from ctiller:cgg 7c37893667
PiperOrigin-RevId: 671104484
Fallback interop test is fully deployed. This variable is no longer needed.
Closes #37620
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37620 from eugeneo:no-fallback-var c21509d0a5
PiperOrigin-RevId: 670738146
Closes #37572
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37572 from yijiem:upgrade-clang-7 03f8cdc54e
PiperOrigin-RevId: 670692896
Closes #37572
PiperOrigin-RevId: 670631284
We have base images that still need to be migrated (for instance, from marketplace.gcr.io). This change restores support for GCR, to be maintained until all images are migrated.
Closes #37596
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37596 from paulosjca:ar 9c4c91b9bb
PiperOrigin-RevId: 669396634
Final piece of gRFC A83 (https://github.com/grpc/proposal/pull/438): the GCP authentication filter itself.
Infrastructure changes include:
- Added a general-purpose LRU cache library that can be reused elsewhere.
- Fixed the client channel code to use the channel args returned by the resolver for the dynamic filters. This was necessary so that the GCP auth filter could access the `XdsConfig` object, which is passed via a channel arg.
- Unlike the other xDS HTTP filters we support, the GCP auth filter does not support config overrides, and its configuration includes a cache size parameter that we always need at the channel level, not per-call. As a result, I had to change the xDS HTTP filter API to give it the ability to set top-level fields in the service config, not just per-method fields. (We use the service config as a way of passing configuration down into xDS HTTP filters.) Note that for now, this works only on the client side, because we don't have machinery for a top-level service config on the server side.
- The GCP auth filter is also the first case where the filter needs to know its instance name from the xDS config, so I changed the xDS HTTP filter API to plumb that through.
- Fixed a bug in the HTTP client library that prevented the override functions from declining to override a particular request.
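As a rough illustration of what a general-purpose LRU cache does, here is a minimal Python sketch. The class name, method name, and get-or-insert shape are assumptions made for this example; they are not taken from the library added in this change.

```python
from collections import OrderedDict


class LruCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._max_size = max_size
        self._entries = OrderedDict()

    def get_or_insert(self, key, create):
        """Returns the cached value for `key`, creating it on a miss."""
        if key in self._entries:
            # A hit makes the entry the most recently used one.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = create(key)
        self._entries[key] = value
        if len(self._entries) > self._max_size:
            # Drop the entry at the front, which is the least recently used.
            self._entries.popitem(last=False)
        return value

    def __len__(self):
        return len(self._entries)
```

A cache like this keeps memory bounded by a configured size, which is why a cache size parameter has to be known wherever the cache is created.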
Closes #37550
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37550 from markdroth:xds_gcp_auth_filter 19eaefb52f
PiperOrigin-RevId: 669371249
[Gpr_To_Absl_Logging] Adding comments to experimental absl wrappers.
These will give additional context on why these were needed and when they should be removed.
Closes #37597
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37597 from tanvi-jagtap:add_comment 7dc38b78e9
PiperOrigin-RevId: 669197397
Fixes a bug in the backoff implementation whereby we were incorrectly failing to apply jitter to the initial backoff.
Also change the API to return `Duration` instead of `Timestamp`. The only caller that actually wants to count the backoff from the start of the previous attempt instead of the end of the previous attempt is the subchannel code, and it handles that on its end.
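The two changes can be sketched together in Python. This is an illustrative model only, not the C++ backoff library: the class, its parameters, and `delay_from_attempt_start` are hypothetical, and the random source is injectable so the behavior can be checked deterministically.

```python
import random


class BackOff:
    """Exponential backoff that returns delays in seconds, not deadlines."""

    def __init__(self, initial, multiplier, jitter, max_backoff,
                 rng=random.uniform):
        if initial <= 0 or multiplier < 1 or not 0 <= jitter < 1:
            raise ValueError("invalid backoff parameters")
        self._initial = initial
        self._multiplier = multiplier
        self._jitter = jitter
        self._max_backoff = max_backoff
        self._rng = rng
        self._next_base = min(initial, max_backoff)

    def next_attempt_delay(self):
        base = self._next_base
        self._next_base = min(base * self._multiplier, self._max_backoff)
        # Jitter applies to every delay, including the very first one.
        return base * self._rng(1 - self._jitter, 1 + self._jitter)

    def reset(self):
        self._next_base = min(self._initial, self._max_backoff)


def delay_from_attempt_start(delay, attempt_duration):
    """For a caller that counts backoff from the start of the last attempt."""
    return max(0.0, delay - attempt_duration)
```

Returning a duration leaves the choice of reference point to the caller: most callers wait for the full delay after an attempt ends, while a caller that measures from the attempt's start can subtract the time the attempt took.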
Closes #37595
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37595 from markdroth:backoff_fixes_and_api_improvement 39d083c0f4
PiperOrigin-RevId: 669112557
[Gpr_To_Absl_Logging] Remove gpr_log. Adding absl LOG wrappers
List of changes in this PR:
1. Replacing all instances of gpr_log in PHP and Ruby with the new absl wrapper APIs. The replacement mapping is given below:
   - gpr_log(GPR_ERROR, ...)
     - => grpc_absl_log_error
   - gpr_log(GPR_INFO, ...)
     - => grpc_absl_log_info - Printing a simple message
     - => grpc_absl_log_info_int - Printing a message and a number
     - => grpc_absl_log_info_str - Printing 2 strings
   - gpr_log(GPR_DEBUG, ...)
     - => grpc_absl_vlog - Printing a simple message
     - => grpc_absl_vlog_int - Printing a message and a number
     - => grpc_absl_vlog_str - Printing 2 strings

   Adding a grpc_absl_vlog2_enabled() check around gpr_log(GPR_DEBUG, ...).
2. In src/python/grpcio_observability/grpc_observability/observability_util.cc, one instance of the gpr_log to absl LOG replacement was missed earlier. Fixing that.
3. Deleting deprecated gpr stuff: gpr_log_severity, GPR_DEBUG, GPR_INFO, GPR_ERROR, gpr_log.
4. Adding new APIs for Ruby and PHP. These APIs are very simple wrappers around absl.
5. Removing the legacy functions in platform-specific log.cc files. These files are safe to delete now.
6. Fixing the allow list in banned_functions.py. This makes sure that these new wrappers don't get used all over the place by everyone. We carefully allow-list only the PHP and Ruby files for these wrappers. Everywhere else, using these wrappers should fail the sanity tests.
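The allow-list idea in item 6 can be sketched as follows. This is not the contents of banned_functions.py; the function name, the data shapes, and the path prefixes in the usage below are assumptions chosen for illustration.

```python
def find_violations(sources, banned_except):
    """Finds uses of banned functions outside their allow-listed paths.

    `sources` maps a file path to its text. `banned_except` maps a banned
    function name to the path prefixes that are still allowed to call it.
    Returns a list of (path, function) pairs.
    """
    violations = []
    for path, text in sorted(sources.items()):
        for func, allowed_prefixes in banned_except.items():
            if any(path.startswith(prefix) for prefix in allowed_prefixes):
                continue  # this file is on the allow list for `func`
            # Match the call syntax so that `foo(` does not match `foo_int(`.
            if func + "(" in text:
                violations.append((path, func))
    return violations
```

For example, with `{"grpc_absl_log_error": ["src/php/", "src/ruby/"]}` as the allow list, a call from a file under `src/core/` would be reported while the same call under `src/php/` would pass.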
Closes #37431
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37431 from tanvi-jagtap:remove_gpr_error 6e5e9bcfcc
PiperOrigin-RevId: 668586873
These changes are compatible with WORKSPACE.
gcc 7 and 8 docker images are based on Debian 10 with older cmake than gRPC needs. Since it's not easy to change the base Debian image for these docker images, let's install a newer cmake instead. This is a left-over from https://github.com/grpc/grpc/pull/37547.
Closes #37571
PiperOrigin-RevId: 668560102
Add validation of the `Audience` cluster metadata type, as per gRFC A83 (https://github.com/grpc/proposal/pull/438).
I had previously changed the metadata to be represented as JSON in #37468. However, while working on the GCP Authentication filter implementation, I realized that that's not an ideal representation, because it would have required us to validate the JSON on a per-RPC basis, which would be bad for performance. So I've changed the representation of metadata to be an abstract type, and we now store the `Audience` metadata as a simple string. I've also moved metadata into its own type with its own validation code, so that in the future we can use it in places other than CDS (many xDS resource types have metadata fields).
While I was at it, I also added some helper functions for validating the `UInt32Value` and `UInt64Value` wrapper protos.
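The performance argument for a typed metadata representation can be sketched in Python: validate once when the resource is parsed, then do only a cheap lookup per RPC. All names here are hypothetical, the type name is a placeholder rather than the real xDS type URL, and skipping unknown types is a choice made for this sketch only.

```python
from dataclasses import dataclass

# Placeholder type name; the real xDS type URL is not reproduced here.
AUDIENCE_TYPE = "Audience"


@dataclass(frozen=True)
class AudienceMetadata:
    """Typed metadata value: the audience is stored as a plain string."""
    url: str


def parse_cluster_metadata(raw_entries):
    """Validates raw metadata once and returns (typed_entries, errors).

    `raw_entries` maps a metadata key to a (type_name, fields) pair.
    Entries of unknown type are skipped in this sketch.
    """
    typed = {}
    errors = []
    for key, (type_name, fields) in raw_entries.items():
        if type_name != AUDIENCE_TYPE:
            continue
        url = fields.get("url")
        if not isinstance(url, str) or not url:
            errors.append(f"{key}: field 'url' must be a non-empty string")
            continue
        typed[key] = AudienceMetadata(url=url)
    return typed, errors


def audience_for_rpc(typed_metadata, key):
    """Per-RPC path: a dictionary lookup, with no parsing or validation."""
    entry = typed_metadata.get(key)
    return entry.url if isinstance(entry, AudienceMetadata) else None
```

With this split, malformed metadata is reported at resource validation time, and the per-RPC path never has to inspect untyped data.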
Closes #37566
PiperOrigin-RevId: 668281729
Artifact Registry requires the registry as an argument for `gcloud auth configure-docker`.
Closes #37576
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37576 from paulosjca:ar 1ab668fef1
PiperOrigin-RevId: 667769875
The first commit is a pure revert of the revert, and the second one has the fix.
Closes #37573
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37573 from markdroth:call_creds_roll_forward 2476329534
PiperOrigin-RevId: 667672832
The changes in #37531 are causing test failures under run_tests.py (but not bazel), and #37544 was built on top of #37531, so both need to be reverted.
Closes #37567
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37567 from markdroth:call_creds_revert d086e066f5
PiperOrigin-RevId: 666978406
This will fix timestamps on logs and show all `VLOG(2)` logs on tests by default.
Currently, timestamps on logs are shown as -
```
I0000 00:00:1724385276.681936 1894892 config.cc:262] gRPC experiments enabled: call_tracer_in_transport, event_engine_dns, event_engine_listener, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
```
After invoking `absl::InitializeLog()`, this gets fixed to -
```
I0823 03:55:53.993928 1895644 config.cc:262] gRPC experiments enabled: call_tracer_in_transport, event_engine_dns, event_engine_listener, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
```
Closes #37560
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37560 from yashykt:ImproveLoggingForTests 66433336c8
PiperOrigin-RevId: 666956421
As per gRFC A83 (https://github.com/grpc/proposal/pull/438).
For now, I am not exposing this new call creds type via the C-core API or in any C++ or wrapped language public APIs, so there's no way to use it externally. We can easily add that in the future if someone asks, but for now the intent is to use it only internally via the xDS GCP authentication filter, which I'll implement in a subsequent PR.
As part of this, I changed the test framework in credentials_test to check the status code in addition to the message on failure. This exposed several places where existing credential types are returning the wrong status code (unsurprisingly, because of all of the tech debt surrounding grpc_error). I have not fixed this behavior, but I have added TODOs in the test showing which ones I think need to be fixed.
Closes #37544
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37544 from markdroth:gcp_service_account_identity_call_creds 97e0efc48d
PiperOrigin-RevId: 666869692