I'm continuing to look into some flakes here, but in the meantime these shouldn't halt submissions. Marking them flaky.
Closes #37880
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37880 from ctiller:mark-flaky 27427c7978
PiperOrigin-RevId: 684526341
I'm unable to reproduce some of the flakiness here. Enabling tracers to get more information.
Closes #37875
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37875 from yashykt:MoreLogsInXdsE2ETest 4a7eb67202
PiperOrigin-RevId: 684201791
With the CL-first approach, the docker test configs for Binder need to be deleted before the Binder code and tests themselves can be deleted in the next step. Sanity checks fail otherwise.
Closes #37862
PiperOrigin-RevId: 683691175
On rare occasions on Windows machines (0.3-0.4%), the tests get stuck while executing the loop of 10 DoRpc calls, and we receive Deadline Exceeded. This PR bumps the deadline from 10s to 60s (no flakes with --runs_per_test=10000).
Closes #37844
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37844 from erm-g:seqFix 8644db8194
PiperOrigin-RevId: 681891281
This is the last piece of gRFC A83 (https://github.com/grpc/proposal/pull/438).
Note that although this is the first use-case for this "blackboard" mechanism, we will also use it in the future for the xDS rate-limiting filter on the gRPC server side.
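As a rough illustration of the "blackboard" idea, here is a minimal Python sketch, assuming the blackboard is a store of entries identified by their type plus a string key, which a filter can consult to reuse state instead of rebuilding it. The names `Blackboard`, `TokenCache`, and `build_filter_state` are hypothetical and are not the gRPC Core API.

```python
from typing import Dict, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


class Blackboard:
    """Holds entries identified by (entry type, string key)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[type, str], object] = {}

    def get(self, entry_type: Type[T], key: str) -> Optional[T]:
        # Returns None when no entry of that type exists under the key.
        return self._entries.get((entry_type, key))  # type: ignore[return-value]

    def set(self, key: str, entry: object) -> None:
        self._entries[(type(entry), key)] = entry


class TokenCache:
    """Hypothetical filter state that is expensive to rebuild."""

    def __init__(self, size: int) -> None:
        self.size = size


def build_filter_state(
    old: Optional[Blackboard], new: Blackboard, key: str
) -> TokenCache:
    """Reuse the entry from the old blackboard if present, else create one."""
    cache = old.get(TokenCache, key) if old is not None else None
    if cache is None:
        cache = TokenCache(size=10)
    new.set(key, cache)
    return cache
```

In this sketch, state survives a rebuild only if the new filter instance copies it forward into the new blackboard; anything not copied is dropped along with the old one.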
Closes #37646
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37646 from markdroth:gcp_auth_filter_state 72d0d96c79
PiperOrigin-RevId: 679707134
Allows use of party-to-party wakeup batching, which drastically reduces thread hops for this transport.
Closes #37078
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37078 from ctiller:chaotic-party-3 75c32e6a64
PiperOrigin-RevId: 679685211
This PR adds templating for Python versions and updates the maximum supported Python version to 3.13. It makes the following major templating changes:
- The minimum supported Python version and the list of supported versions in `setup.py` are now read from new template-generated files named `python_version.py`.
- Dockerfiles for the different Python Linux builds are now template-generated.
- The "Supported Python Versions" section has been removed from the READMEs of the ancillary and main packages.
Note: All the `python_version.py` files and Linux build `Dockerfiles` except `tools/dockerfile/grpc_artifact_python_linux_armv7/Dockerfile` in the PR are generated from the respective templates.
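To make the `setup.py` side of this concrete, here is a minimal sketch of how a setup script might consume values from a generated `python_version.py`. The constant names, the version list, and the helper functions are assumptions for illustration; only the maximum version (3.13) comes from this PR description.

```python
from typing import List

# Stand-in for the contents of a template-generated python_version.py.
# Constant names and the minimum version are illustrative assumptions.
SUPPORTED_PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
MIN_PYTHON_VERSION = "3.8"


def build_classifiers(versions: List[str]) -> List[str]:
    """Produce one trove classifier per supported Python version."""
    return ["Programming Language :: Python :: " + v for v in versions]


def build_python_requires(min_version: str) -> str:
    """Produce the python_requires specifier for setuptools."""
    return ">=" + min_version


if __name__ == "__main__":
    for classifier in build_classifiers(SUPPORTED_PYTHON_VERSIONS):
        print(classifier)
    print(build_python_requires(MIN_PYTHON_VERSION))
```

Keeping the versions in one generated file means that adding a new Python release only requires changing the template input, not every package's `setup.py`.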
Further non-templated additions to add support for Python 3.13:
- Install scripts and artifacts for Windows, macOS, and Linux are added manually. These can be templated later as well.
- Updated Cython bounds to 3.x.
- Updated the twine version to solve the [cgi module import error](https://github.com/pypa/twine/issues/1046).
- The twine update introduces a dependency on cryptography>=2.0. Because the cryptography package doesn't support 32-bit Linux images, `twine check` has been disabled for x86 manylinux and x86 musllinux artifacts.
Closes #37643
PiperOrigin-RevId: 678954495
Basic building block for retries, hedging: buffer outgoing messages & metadata, allow for replay whilst buffered (with a single reader able to read once buffering ends)
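The buffering behavior described above can be sketched roughly as follows. This is a simplified, single-threaded Python model, not the actual implementation: the class and method names (`ReplayBuffer`, `Reader`, `commit`) are hypothetical, and memory trimming after buffering ends is omitted.

```python
class Reader:
    """Reads buffered items in order, starting from the beginning."""

    def __init__(self, buffer):
        self._buffer = buffer
        self._next = 0  # index of the next item this reader will see

    def pull(self):
        return self._buffer._pull(self)


class ReplayBuffer:
    """Buffers outgoing items so several attempts can replay them."""

    def __init__(self):
        self._items = []
        self._committed_reader = None  # set once buffering ends

    def new_reader(self):
        if self._committed_reader is not None:
            raise RuntimeError("buffering has ended; replay is not possible")
        return Reader(self)

    def push(self, item):
        self._items.append(item)

    def commit(self, reader):
        """End buffering: from now on only `reader` may keep reading."""
        self._committed_reader = reader

    def _pull(self, reader):
        if self._committed_reader not in (None, reader):
            raise RuntimeError("reader was abandoned when buffering ended")
        if reader._next >= len(self._items):
            return None  # nothing new yet
        item = self._items[reader._next]
        reader._next += 1
        return item
```

While buffering, each retry or hedged attempt gets its own reader and sees every item from the start; once one attempt is committed, the others can no longer read.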
Closes #37448
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37448 from ctiller:once-again-into-the-breach-my-friends 79cb121054
PiperOrigin-RevId: 677959212
Log error message instead of crashing for this API misuse.
Closes #37764
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37764 from yijiem:report-different-gauge-wont-crash 1b6e912bfc
PiperOrigin-RevId: 677944595
Fix https://github.com/grpc/grpc/issues/37727.
A better idea might be to set up fuzzing for these APIs to find this sort of issue. That could be a next step if we want to harden things further.
Closes #37737
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37737 from yijiem:memory-leak-alts-2 6be8a49e63
PiperOrigin-RevId: 677880955
Add a ValidateCredentials API to the TLS certificate provider interface. A user can call this API to check that the credentials currently held by the certificate provider instance are valid. The definition of "valid" depends on the provider that is being used. For the static data and file watcher providers, "valid" means that the credentials consist of valid PEM.
~Currently there is no check to ensure that credentials consist of valid PEM blocks before a TLS handshake commences. This PR creates a static factory for FileWatcherCertificateProvider (and marks the constructor as deprecated) which performs this validation check. The analogous work for StaticDataCertificateProvider will be done in a follow-up PR.~
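For intuition about what "consist of valid PEM" can mean, here is a minimal Python sketch of a syntactic PEM check. It is an assumption-laden illustration, not the check gRPC performs: the function name `looks_like_valid_pem` is hypothetical, and it only verifies that there is at least one BEGIN/END block with matching labels and a base64 body, without parsing the decoded certificate or key.

```python
import base64
import binascii
import re

# Matches one PEM block; the END label must equal the BEGIN label.
_PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.*?)\r?\n-----END \1-----",
    re.DOTALL,
)


def looks_like_valid_pem(data: str) -> bool:
    """True if data has at least one PEM block and every body is base64."""
    blocks = _PEM_BLOCK.findall(data)
    if not blocks:
        return False
    for _label, body in blocks:
        try:
            # Strip line breaks before decoding; reject non-base64 characters.
            base64.b64decode("".join(body.split()), validate=True)
        except (binascii.Error, ValueError):
            return False
    return True
```

A check like this lets a caller reject malformed credentials up front instead of discovering the problem when a TLS handshake starts.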
Closes #37565
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37565 from matthewstevenson88:filewatcher f223228023
PiperOrigin-RevId: 677847751
Ensure OpenSSL global cleanup happens after gRPC shutdown completes. OpenSSL registers an exit handler to clean up global objects, which may run before gRPC removes all references to OpenSSL.
Closes #37768
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37768 from yousukseung:openssl-atexit-wait d3d1c964a8
PiperOrigin-RevId: 677284514
The following files have been moved:
- src/core/lib/avl/*
- src/core/lib/backoff/*
- src/core/lib/debug/event_log*
- src/core/lib/iomgr/gethostname*
- src/core/lib/iomgr/grpc_if_nametoindex*
- src/core/lib/matchers/*
- src/core/lib/uri/* (renamed from uri_parser.* to uri.*)
- src/core/lib/gprpp/* (existing src/core/util/time.cc was renamed to gpr_time.cc to avoid conflict)
Closes #36792
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/36792 from markdroth:reorg_util d4e8996f48
PiperOrigin-RevId: 676947640
Closes #37773
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37773 from yijiem:alts-concurrent-connect-timeout 99e371f3ac
PiperOrigin-RevId: 676895049
Looks like our MSAN build is just taking longer at the moment, so increase sharding to reduce per-shard runtime.
Closes #37780
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37780 from ctiller:flake-fightas-13 07c422e977
PiperOrigin-RevId: 676869265
On rare occasions on Windows machines (0.2-0.5%), the tests get stuck before the handshake when we execute `grpc_call_start_batch`. In such cases we receive OP_COMPLETE with `Deadline Exceeded {grpc_status:4}`. This PR bumps the deadline from 5s to 30s (no flakes with --runs_per_test=1000).
Closes #37767
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37767 from erm-g:h2test 9d1208ee1f
PiperOrigin-RevId: 676531975
Previously, if we pulled server trailing metadata *before* the call was added to the client transport then we'd never call `on_done_` on the spine and consequently never remove the call from the map. This change fixes that edge case.
In fixing it, I noticed a state in `CallState` that was both complicating the fix and completely irrelevant because we respecced earlier this year to say that ServerTrailingMetadata processing cannot be asynchronous, so I'm removing that state also.
Closes #37749
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37749 from ctiller:flake-fightas-11 847814a286
PiperOrigin-RevId: 676246259
The gRPC Core API currently requires callers to provide initial metadata before trailing metadata. You can see the C++ Callback API do this bookkeeping, for example. There is an eventual goal to be able to provide these in any order, and have gRPC do the right thing, but core is not there yet.
The proxy fixture in our end2end tests had a rare scenario in which trailing metadata from the server would show up at the proxy before initial metadata. This is part of the proxy's job: to split up batches into single operations that can complete in any order. There was, however, a rare flake wherein trailing metadata would complete before initial metadata, and the result was both client and server waiting on each other to respond.
This change adds a way for the proxy to defer sending trailing metadata back to the client, until after initial metadata has been sent to the client. In my testing, this eliminates the flake I had been able to reproduce 1 in 10k times using a single test. It happened more frequently across the full set of tests in our CI test suites.
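The deferral described above can be modeled roughly as follows. This is a minimal Python sketch of the ordering logic only; `ProxyCall` and the `send_to_client` callback are hypothetical names, and the real fixture operates on gRPC Core batches rather than plain dictionaries.

```python
class ProxyCall:
    """Forwards server metadata to the client, initial before trailing."""

    def __init__(self, send_to_client):
        self._send = send_to_client
        self._initial_sent = False
        self._deferred_trailing = None  # trailing metadata held back, if any

    def on_server_initial_metadata(self, metadata):
        self._send("initial", metadata)
        self._initial_sent = True
        # Flush trailing metadata that arrived too early.
        if self._deferred_trailing is not None:
            trailing, self._deferred_trailing = self._deferred_trailing, None
            self._send("trailing", trailing)

    def on_server_trailing_metadata(self, metadata):
        if self._initial_sent:
            self._send("trailing", metadata)
        else:
            # Initial metadata has not gone out yet, so hold this back.
            self._deferred_trailing = metadata
```

Whichever order the two server-side operations complete in, the client always sees initial metadata first.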
Closes #37738
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37738 from drfloob:fix-proxy-fixture 6e0d7b7e6f
PiperOrigin-RevId: 676026493
Without this, we see GOAWAYs with "enter idle" regardless of whether the cause is idleness or max connection age.
Closes #37709
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37709 from yashykt:ChannelIdleFilterMessage 236072e7e2
PiperOrigin-RevId: 675762380
The `grpc_distribtests_gcp_python` test is [failing with a permission error](https://fusion2.corp.google.com/ci;ids=1930537984/kokoro/prod:grpc%2Fcore%2Fmaster%2Flinux%2Fgrpc_distribtests_gcp_python/activity/2c6d57dd-f504-4834-9fd6-845db47ede01/log):
```
ERROR: (gcloud.functions.deploy) ResponseError: status=[403], code=[Ok], message=[Permission 'run.services.setIamPolicy' denied on resource 'projects/grpc-testing/locations/us-central1/services/grpc-gcf-distribtest-3cd3cc11-1a8f-4b88-9c3b-220f0bdf9fdc' (or resource may not exist).]
```
According to the logs, we're now creating gen2 functions by default:
```
As of this Cloud SDK release, new functions will be deployed as 2nd gen functions by default. This is equivalent to currently deploying new with the --gen2 flag. Existing 1st gen functions will not be impacted and will continue to deploy as 1st gen functions.
```
Since we don't need the features provided by gen2 functions, and using `--no-gen2` fixed the permission issue, we're adding the `--no-gen2` flag to our test.
### Test:
* Passed manual run: http://sponge/e52ffb74-5a12-4e67-9f95-4638390ef57c
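As a small illustration of the change, the following Python sketch builds a `gcloud functions deploy` command line with `--no-gen2` included. The helper name and the function name passed to it are placeholders; the real test script and its other arguments are not shown in this description.

```python
import shlex
from typing import List, Sequence


def build_deploy_command(
    function_name: str, extra_args: Sequence[str] = ()
) -> List[str]:
    """Build a `gcloud functions deploy` argv pinned to 1st gen functions."""
    return ["gcloud", "functions", "deploy", function_name, "--no-gen2", *extra_args]


if __name__ == "__main__":
    # "my-distribtest-function" is a placeholder, not the real test's name.
    print(shlex.join(build_deploy_command("my-distribtest-function")))
```

Passing the flag explicitly keeps the deployment on 1st gen even though newer Cloud SDK releases default to 2nd gen.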
Closes #37684
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37684 from XuanWang-Amos:fix_gcf 26588c0cc3
PiperOrigin-RevId: 675703636
This test has been flaking for a while with a WSAEACCES error on the `bind` call.
Change the loop to create only one socket at a time (on Windows), to rule out the possibility that something Windows-specific dislikes our opening multiple listen sockets on the same port.
Closes #37669
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37669 from apolcyn:change_loop ffa105ba46
PiperOrigin-RevId: 675172803
The issue was noticed on xds_end2end_test and is made worse when reducing `xds_resource_does_not_exist_timeout_ms` to 500 and running under TSAN.
Closes #37678
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37678 from yashykt:XdsClientOnTimerDebugging 1d31e28d2c
PiperOrigin-RevId: 673479242