This test is flaky, example failure: https://fusion2.corp.google.com/invocations/31bcd70f-650d-40be-8ec4-6fafc3cd3724, error message:
```
Traceback (most recent call last):
File "/usr/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
yield
File "/usr/lib/python3.9/unittest/case.py", line 593, in run
self._callTestMethod(testMethod)
File "/usr/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
method()
File "/var/local/git/grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
return loop.run_until_complete(f(*args, **kwargs))
File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/var/local/git/grpc/src/python/grpcio_tests/tests_aio/unit/call_test.py", line 820, in test_cancel_after_done_writing
self.assertFalse(call.done())
File "/usr/lib/python3.9/unittest/case.py", line 676, in assertFalse
raise self.failureException(msg)
AssertionError: True is not false
```
We're expecting call to NOT finish immediately after `await call.done_writing()` which is not correct.
* `call.done_writing()` only notifies server that the client is done sending messages, while `call.done()` checks if the RPC call itself has finished.
Thus removing assert check for `call.done()`.
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#37651
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37651 from XuanWang-Amos:fix_aio_cancel_after_done_writing 42add84236
PiperOrigin-RevId: 671834305
This test is flaky for a while now, most of the time it failed with timeout error, one [example](https://btx.cloud.google.com/invocations/c31650ae-afdd-4e50-bfc4-b9a43588d3ba/log):
```
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
yield
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 676, in run
self._callTestMethod(testMethod)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
method()
File "/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_python_macos_opt_native/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
return loop.run_until_complete(f(*args, **kwargs))
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_python_macos_opt_native/src/python/grpcio_tests/tests_aio/unit/connectivity_test.py", line 54, in test_unavailable_backend
await asyncio.wait_for(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/tasks.py", line 501, in wait_for
raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
```
Suspect the root cause is the same mentioned here: https://github.com/grpc/grpc/pull/26409
Instead of disable this for MacOS, changing the timeout from `4s` to `8s` seems reasonable too.
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#37645
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37645 from XuanWang-Amos:fix_aio_connectivity_test 10fdf90635
PiperOrigin-RevId: 671523025
This is flaky in CI, and is being replaced by the new implementation with event engine, disabling this test.
Will clean them up after all have switched to event engine.
Closes#37636
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37636 from HannahShiSFB:disable-iomgr-cfstream-test b430ee3d7c
PiperOrigin-RevId: 671399515
Fix some flakiness caused by fixing jitter in the backoff library in #37595.
In the xDS retry tests, the additional jitter made it such that an extra attempt snuck in before the call deadline, so I adjusted the knobs to ensure that exactly the expected number of attempts fit into the tests.
In the PF test, I rewrote the test to use a connection injector, so that it can more accurately tell the time between the connection attempts, without seeing skew due to the server startup time.
Closes#37629
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37629 from markdroth:xds_retry_backoff_flake_fix f35ac902a0
PiperOrigin-RevId: 671120341
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#37632
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37632 from yijiem:alts-concurrent-connectivity-test-flake 68df4b1327
PiperOrigin-RevId: 671117730
Looks like there are some odd interactions, but call-v3 doesn't (and will never) handle wakeup sets, so disable for now until iomgr is removed.
Closes#37630
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37630 from ctiller:cgg 7c37893667
PiperOrigin-RevId: 671104484
Fallback interop test is fully deployed. This variable is no longer needed.
Closes#37620
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37620 from eugeneo:no-fallback-var c21509d0a5
PiperOrigin-RevId: 670738146
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#37572
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37572 from yijiem:upgrade-clang-7 03f8cdc54e
PiperOrigin-RevId: 670692896
<!--
If you know who should review your pull request, please assign it to that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the appropriate
lang label.
-->
Closes#37572
PiperOrigin-RevId: 670631284
We have base images that still need to be migrated (for instance, from marketplace.gcr.io). This change restores support for GCR, to be maintained until all images are migrated.
Closes#37596
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37596 from paulosjca:ar 9c4c91b9bb
PiperOrigin-RevId: 669396634
Final piece of gRFC A83 (https://github.com/grpc/proposal/pull/438): the GCP authentication filter itself.
Infrastructure changes include:
- Added a general-purpose LRU cache library that can be reused elsewhere.
- Fixed the client channel code to use the channel args returned by the resolver for the dynamic filters. This was necessary so that the GCP auth filter could access the `XdsConfig` object, which is passed via a channel arg.
- Unlike the other xDS HTTP filters we support, the GCP auth filter does not support config overrides, and its configuration includes a cache size parameter that we always need at the channel level, not per-call. As a result, I had to change the xDS HTTP filter API to give it the ability to set top-level fields in the service config, not just per-method fields. (We use the service config as a way of passing configuration down into xDS HTTP filters.) Note that for now, this works only on the client side, because we don't have machinery for a top-level service config on the server side.
- The GCP auth filter is also the first case where the filter needs to know its instance name from the xDS config, so I changed the xDS HTTP filter API to plumb that through.
- Fixed a bug in the HTTP client library that prevented the override functions from declining to override a particular request.
Closes#37550
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37550 from markdroth:xds_gcp_auth_filter 19eaefb52f
PiperOrigin-RevId: 669371249
[Gpr_To_Absl_Logging] Adding comments to experimental absl wrappers .
These will give additional context of why these were needed and when they should be removed.
Closes#37597
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37597 from tanvi-jagtap:add_comment 7dc38b78e9
PiperOrigin-RevId: 669197397
Fixes a bug in the backoff implementation whereby we were incorrectly failing to apply jitter to the initial backoff.
Also change the API to return `Duration` instead of `Timestamp`. The only caller that actually wants to count the backoff from the start of the previous attempt instead of the end of the previous attempt is the subchannel code, and it handles that on its end.
Closes#37595
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37595 from markdroth:backoff_fixes_and_api_improvement 39d083c0f4
PiperOrigin-RevId: 669112557
[Gpr_To_Absl_Logging] Remove gpr_log. Adding absl LOG wrappers
List of changes in this PR
1. Replacing all instances of gpr_log in PHP and RUBY with the new absl wrapper APIs. The replacement mapping is given below
gpr_log(GPR_ERROR, ...)
=> grpc_absl_log_error
gpr_log(GPR_INFO, ...)
=> grpc_absl_log_info - Printing a simple message
=> grpc_absl_log_info_int - Printing a message and a number
=> grpc_absl_log_info_str - Printing 2 strings.
gpr_log(GPR_DEBUG, ...)
=> grpc_absl_vlog - Printing a simple message
=> grpc_absl_vlog_int - Printing a message and a number
=> grpc_absl_vlog_str - Printing 2 strings.
Adding grpc_absl_vlog2_enabled() check around gpr_log(GPR_DEBUG, ...)
2. src/python/grpcio_observability/grpc_observability/observability_util.cc One instance of gpr_log to absl LOG replacement was missed earlier. Fixing that.
3. Deleting deprecated gpr stuff : gpr_log_severity , GPR_DEBUG , GPR_INFO , GPR_ERROR , gpr_log .
4. Adding new APIs for Ruby and PHP. These APIs are very simple wrappers around absl.
5. Removing the legacy functions in platform specific log.cc files. These files are safe to delete now.
6. Fixing the allow list in banned_functions.py . This makes sure that these new wrappers don't get used all over the place by everyone. We carefully only allow list the PHP and RUBY files and allow the use of these wrappers. Everywhere else - using these wrappers should fail Sanity Tests.
Closes#37431
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37431 from tanvi-jagtap:remove_gpr_error 6e5e9bcfcc
PiperOrigin-RevId: 668586873
These changes are compatible with WORKSPACE.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
gcc 7 and 8 docker images are based on Debian 10 with older cmake than gRPC needs. Since it's not easy to change the base Debian image for these docker images, let's install a newer cmake instead. This is a left-over from https://github.com/grpc/grpc/pull/37547.
Closes#37571
PiperOrigin-RevId: 668560102
Add validation of the `Audience` cluster metadata type, as per gRFC A83 (https://github.com/grpc/proposal/pull/438).
I had previously changed the metadata to be represented as JSON in #37468. However, while working on the GCP Authentication filter implementation, I realized that that's not an ideal representation, because it would have required us to validate the JSON on a per-RPC basis, which would be bad for performance. So I've changed the representation of metadata to be an abstract type, and we now store the `Audience` metadata as a simple string. I've also moved metadata into its own type with its own validation code, so that in the future we can use it in places other than CDS (many xDS resource types have metadata fields).
While I was at it, I also add some helper functions for validating the `UInt32Value` and `UInt64Value` wrapper protos.
Closes#37566
PiperOrigin-RevId: 668281729
Artifact Registry requires the registry as an argument for `gcloud auth configure-docker`.
Closes#37576
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37576 from paulosjca:ar 1ab668fef1
PiperOrigin-RevId: 667769875
The first commit is a pure revert of the revert, and the second one has the fix.
Closes#37573
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37573 from markdroth:call_creds_roll_forward 2476329534
PiperOrigin-RevId: 667672832
The changes in #37531 are causing test failures under run_tests.py (but not bazel), and #37544 was built on top of #37531, so both need to be reverted.
Closes#37567
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37567 from markdroth:call_creds_revert d086e066f5
PiperOrigin-RevId: 666978406
This will fix timestamps on logs and show all `VLOG(2)` logs on tests by default.
Currently, timestamps on logs are shown as -
```
I0000 00:00:1724385276.681936 1894892 config.cc:262] gRPC experiments enabled: call_tracer_in_transport, event_engine_dns, event_engine_listener, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
```
After invoking `absl::InitializeLog()`, this gets fixed to -
```
I0823 03:55:53.993928 1895644 config.cc:262] gRPC experiments enabled: call_tracer_in_transport, event_engine_dns, event_engine_listener, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
```
Closes#37560
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37560 from yashykt:ImproveLoggingForTests 66433336c8
PiperOrigin-RevId: 666956421
As per gRFC A83 (https://github.com/grpc/proposal/pull/438).
For now, I am not exposing this new call creds type via the C-core API or in any C++ or wrapped language public APIs, so there's no way to use it externally. We can easily add that in the future if someone asks, but for now the intent is to use it only internally via the xDS GCP authentication filter, which I'll implement in a subsequent PR.
As part of this, I changed the test framework in credentials_test to check the status code in addition to the message on failure. This exposed several places where existing credential types are returnign the wrong status code (unsurprisingly, because of all of the tech debt surrounding grpc_error). I have not fixed this behavior, but I have added TODOs in the test showing which ones I think need to be fixed.
Closes#37544
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37544 from markdroth:gcp_service_account_identity_call_creds 97e0efc48d
PiperOrigin-RevId: 666869692
This adds functionality that is intended to be used for the new GcpServiceAccountIdentityCallCredentials implementation, as per gRFC A83 (https://github.com/grpc/proposal/pull/438). However, it is also a useful improvement for all token-fetching call credentials types, so I am adding it to the base class.
Closes#37531
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37531 from markdroth:token_fetcher_call_creds_prefetch_and_backoff 0fcdb48465
PiperOrigin-RevId: 666809903
This change moves the location for the prebuilt images used in gRPC OSS benchmarks from GCR to Artifact Registry.
Images are all saved to the same repo, instead of being organized by a prefix (`kokoro`, `kokoro-test`,
`cxx_experiment/kokoro-test`, etc.).
Closes#37558
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37558 from paulosjca:ar a767ab5d91
PiperOrigin-RevId: 666643972
Previously, `grpc_oauth2_token_fetcher_credentials` provided functionality for on-demand token-fetching, but it was integrated into the oauth2 code, so it was not possible to use that same code for on-demand fetching of (e.g.) JWT tokens. This PR splits that class into two parts:
1. A base `TokenFetcherCredentials` class that provides a framework for on-demand fetching of any arbitrary type of auth token.
2. An `Oauth2TokenFetcherCredentials` subclass that derives from `TokenFetcherCredentials` and provides handling for oauth2 tokens.
The `grpc_compute_engine_token_fetcher_credentials`, `StsTokenFetcherCredentials`, and `grpc_google_refresh_token_credentials` classes that previously derived from `grpc_oauth2_token_fetcher_credentials` now derive from `Oauth2TokenFetcherCredentials` instead, so there's not much change to those classes (other than a cleaner interface with the base class functionality).
The `ExternalAccountCredentials` class and its subclasses got more extensive changes here. Previously, this class inheritted from `grpc_oauth2_token_fetcher_credentials` and fooled the base class into thinking that it directly fetched the oauth2 token, when in fact it actually performed a number of steps to gather data and then constructed a synthetic HTTP response to pass back to the base class. I have changed this to instead derive directly from `TokenFetcherCredentials` to provide a much cleaner interface with the parent class.
In addition, I have changed `grpc_call_credentials` from `RefCounted<>` to `DualRefCounted<>` to provide a clean way to shut down any in-flight token fetch when the credentials are unreffed.
This PR paves the way for subsequent work that will allow implementing an on-demand JWT token fetcher call credential, as part of gRFC A83 (https://github.com/grpc/proposal/pull/438).
Closes#37510
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37510 from markdroth:token_fetcher_call_creds_refactor 3bd398a762
PiperOrigin-RevId: 666547985
I observed a case where `epoll_wait` would hang after receiving all available bytes (209 total). Note that all epoll events are configured to be edge-triggered (EPOLLET). The message was an HTTP request where the the HTTP headers indicated the connection would be closed. Presumably the peer would close their end of the connection, but the single epoll event did not contain EPOLLHUP. Regardless, the entire payload could be consumed in one read.
*Without* the TCP_INQ logic, the gRPC endpoint assumed that there was more data to read after the first read of 209 bytes. Since there was sufficient buffer space to read again, the endpoint called `recvmsg` a second time and immediately got EAGAIN. This triggered the downstream code to process the received data. Subsequent endpoint Reads would attempt a `recvmsg` first, and only if that failed, would then opt in to be notified of epoll events on their watched socket.
However, the INQ logic caused a nearly constant hang in this one environment, but non-deterministically. After a successful recvmsg (i.e., some bytes were received without any errors), INQ accurately showed there were no more bytes to read, and `recvmsg` was not called again, so the edge was never seen. The data was returned to the caller and the read was complete. The next endpoint Read would *first* ask to be notified of fd events, since the previous read claimed there was nothing left to read on that socket (this is the bug). Due to the nature of EPOLLET[1], the fd was very likely still in a ready state from the previous read, since it didn't call `recvmsg` until the edge was found. So the subsequent read was stuck waiting on a socket change event that may not come.
This fix resets the endpoint's state after consuming all available bytes according to TCP_INQ. This ensures that subsequent Reads will first attempt a read on the fd before requesting to be notified of new fd events.
[1]: https://www.man7.org/linux/man-pages/man7/epoll.7.html
> Receiving an event from epoll_wait(2) should suggest to you that such file descriptor is ready for the requested I/O operation. You must consider it ready until the next (nonblocking) read/write yields EAGAIN.
PiperOrigin-RevId: 666398394