Experiment 1: On RST_STREAM: reduce MAX_CONCURRENT_STREAMS for one round
trip.
Experiment 2: If a settings frame is outstanding with a lower
MAX_CONCURRENT_STREAMS than is configured, and we receive a new incoming
stream that would exceed the new cap, randomly reject it.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Isolate ping callback tracking to its own file.
Also takes the opportunity to simplify keepalive code by applying the
ping timeout to all pings.
Adds an experiment to allow multiple pings outstanding too (this was
originally an accidental behavior change of the work, but one that I
think may be useful going forward).
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
More changes as part of the dualstack design:
- Change resolver and LB policy APIs to support multiple addresses per
endpoint. Specifically, replace `ServerAddress` with
`EndpointAddresses`, which encodes more than one address. Per-address
channel args are retained at the same level, so they are now
per-endpoint. For now, `EndpointAddress` provides a single-address ctor
and a single-address accessor for backward compatibility, so
`ServerAdress` is an alias for `EndpointAddresses`; eventually, this
alias and the single-address methods will be removed.
- Add an `EndpointAddressSet` class, which represents an unordered set
of addresses to be used as a map key. This will be used in a number of
LB policies that need to store per-endpoint state.
- Change the LB policy API's `ChannelControlHelper::CreateSubchannel()`
method to take the address and per-endpoint channel args as separate
parameters, so that we don't need to construct a legacy `ServerAddress`
object as we create a new subchannel for each address in the endpoint.
- Change pick_first to flatten the address list.
- Change ring_hash to use `EndpointAddressSet` as the key for its
endpoint map, and to use the first address of the endpoint as the hash
key.
- Change WRR to use `EndpointAddressSet` as the key for its endpoint
weight map.
Note that support for multiple addresses per endpoint is guarded in RR
by the existing `round_robin_delegate_to_pick_fist` experiment and in
WRR by the existing `wrr_delegate_to_pick_first` experiment.
This PR does *not* include support for multiple addresses per endpoint
for the outlier_detection or xds_override_host LB policies; those will
come in subsequent PRs.
Fix: https://github.com/grpc/grpc/issues/33890
In case of abort, currently we don't log anything, this change added an
`AbortError` as the default error for abort, if any other error
happened, we'll log the error so user will be aware of the other error.
Related gRFC: https://github.com/grpc/proposal/pull/388
### Testing
* Tested locally, after this change, any non-AbortError will be logged,
sample log:
```
ERROR:grpc._server:Exception happened while abort: Other exception happened!
Traceback (most recent call last):
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/9/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests/unit/_abort_test.native.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/_server.py", line 553, in _call_behavior
response_or_iterator = behavior(argument, context)
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/9/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests/unit/_abort_test.native.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests/unit/_abort_test.py", line 95, in abort_with_status_unary_unary_raise_additional_exception
raise Exception("Other exception happened!")
Exception: Other exception happened!
```
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Support Python 3.12.
### Testing
* Passed all Distribution Tests.
* Also tested locally by installing 3.12 artifact.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This has been stable for a bit, everywhere that the EventEngine is
enabled. Going forward, I think the event_engine_{client|listener}
experiments can probably be used to regulate thread-pool-specific
issues.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Most recent attempt was #34320, reverted in #34335.
The first commit here is a pure revert. The second commit fixes the
outlier_detection unit test to pass both with and without the
experiment.
This is a follow up to https://github.com/grpc/grpc/pull/34103
That pull request explicitly aimed to introduce shared library builds
for Windows (DLLs) while effecting zero material change to the existing
build pipelines. That aspiration meant that the grpc++_unsecure library
had to be effectively excluded from the build (because including it
would have also included a dependency on openssl, which makes no sense
given its purpose)
This PR addresses that by:
* Extracting the single function in grpc_tls_certificate_provider with a
dependency on openssl into a separate compilation unit
* Including that new .cc file into the grpc library
* Including grpc_tls_certificate_provider and one other source file into
grpc_unsecure for the Windows DLL build only.
* Reinstating the grpc++_unsecure library which is a prerequisite for
many tests.
* Regenerating all files affected by the changes in Bazel BUILD that
introduce the new source file.
This change does affect the operation of other build pipelines - I have
confirmed that it does not break the Linux Bazel build.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
* Remove `fetch_build_eggs` since they're deprecated and those deps will
be installed by `setuptools`.
* Fix indentation on `run_test` so we don't miss `native` test cases.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
### Background
* `distutils` is deprecated with removal planned for Python 3.12
([pep-0632](https://peps.python.org/pep-0632/)), thus we're trying to
replace all distutils usage with setuptools.
* Please note that user still have access to `distutils` if setuptools
is installed and `SETUPTOOLS_USE_DISTUTILS` is set to `local` (The
default in setuptools, more details can be found [in this
discussion](https://github.com/pypa/setuptools/issues/2806#issuecomment-1193336591)).
### How we decide the replacement
* We're following setuptools [Porting from Distutils
guide](https://setuptools.pypa.io/en/latest/deprecated/distutils-legacy.html#porting-from-distutils)
when deciding the replacement.
#### Replacement not mentioned in the guide
* Replaced `distutils.utils.get_platform()` with
`sysconfig.get_platform()`.
* Based on the [answer
here](https://stackoverflow.com/questions/71664875/what-is-the-replacement-for-distutils-util-get-platform),
and also checked the document that `sysconfig.get_platform()` is good
enough for our use cases.
* Replaced `DistutilsOptionError` with `OptionError`.
* `setuptools.error` is exporting it as `OptionError` [in the
code](https://github.com/pypa/setuptools/blob/v59.6.0/setuptools/errors.py).
* Upgrade `setuptools` in `test_packages.sh` and changed the version
ping to `59.6.0` in `build_artifact_python.bat`.
* `distutils.errors.*` is not fully re-exported until `59.0.0` (See
[this issue](https://github.com/pypa/setuptools/issues/2698) for more
details).
### Changes not included in this PR
* We're patching some compiler related functions provided by distutils
in our code
([example](ee4efc31c1/src/python/grpcio/_spawn_patch.py (L30))),
but since `setuptools` doesn't have similar interface (See [this issue
for more details](https://github.com/pypa/setuptools/issues/2806)), we
don't have a clear path to replace them yet.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
* Before change:
* Test hangs
[here](https://github.com/grpc/grpc/blob/master/src/python/grpcio/grpc/_cython/_cygrpc/aio/call.pyx.pxi#L250)(Called
from
[here](3b23fe62ca/src/python/grpcio/grpc/aio/_call.py (L261)))
waiting for status.
* After change, we'll manually set status. We're also printing traceback
like the following:
```
Traceback (most recent call last):
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 492, in _writ
e
await self._cython_call.send_serialized_message(serialized_request)
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/call.pyx.pxi", line 379, in send_serialized_message
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 163, in _send_message
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 160, in _cython.cygrpc._send_message
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 106, in execute_batch
_cython.cygrpc.ExecuteBatchError: Failed grpc_call_start_batch: 11 with grpc_call_error value: 'GRPC_CALL_ERROR_INVALID_MESSAGE'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", l
ine 31, in wrapper
return loop.run_until_complete(f(*args, **kwargs))
File "/usr/local/google/home/xuanwn/.pyenv/versions/3.10.9/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/metadata_test.py"
, line 310, in test_stream_unary
await call.write(_REQUEST)
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 517, in write
await self._write(request)
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 495, in _writ
e
await self._raise_for_status()
File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 263, in _rais
e_for_status
raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.INTERNAL
details = "Internal error from Core"
debug_error_string = "Failed grpc_call_start_batch: 11 with grpc_call_error value: 'GRPC_CALL_ERROR_INVALID_MESSAGE'"
>
```
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Proposed alternative to https://github.com/grpc/grpc/pull/34024.
This version has a simpler, faster busy-count implementation based on a
sharded set of atomic counts: fast increment/decrement operations,
relatively slower summation of total counts (which need to happen much
less frequently).
- Extract build metadata for some external dependencies from bazel
build. This is achieved by letting extract_metadata_from_bazel_xml.py
analyze some external libraries and sources. The logic is basically the
same as for internal libraries, I only needed to teach
extract_metadata_from_bazel_xml.py which external libraries it is
allowed to analyze.
* currently, the list of source files is automatically determined for
`z`, `upb`, `re2` and `gtest` dependencies (at least for the case where
we're building in "embedded" mode - e.g. mostly native extensions for
python, php, ruby etc. - cmake has the ability to replace some of these
dependencies by actual cmake dependency.)
- Eliminate the need for manually written gen_build_yaml.py for some
dependencies.
- Make the info on target dependencies in build_autogenerated.yaml more
accurate and complete. Until now, there were some depdendencies that
were allowed to show up in build_autogenerated.yaml and some that were
being skipped. This made generating the CMakeLists.txt and Makefile
quite confusing (since some dependencies are being explicitly mentioned
and some had to be assumed by the build system).
- Overhaul the Makefile
* the Makefile is currently only used internally (e.g. for ruby and PHP
builds)
* until now, the makefile wasn't really using the info about which
targets depend on what libraries, but it was effectively hardcoding the
depedendency data (by magically "knowing" what is the list of all the
stuff that e.g. "grpc" depends on).
* After the overhaul, the Makefile.template now actually looks at the
library dependencies and uses them when generating the makefile. This
gives a more correct and easier to maintain makefile.
* since csharp is no longer on the master branch, remove all mentions of
"csharp" targets in the Makefile.
Other notable changes:
- make extract_metadata_from_bazel_xml.py capable of resolving workspace
bind() rules (so that it knows the real name of the target that is
referred to as e.g. `//external:xyz`)
TODO:
- [DONE] ~~pkgconfig C++ distribtest~~
- [DONE} ~~update third_party/README to reflect changes in how some deps
get updated now.~~
Planned followups:
- cleanup naming of some targets in build metadata and buildsystem
templates: libssl vs boringssl, ares vs cares etc.
- further cleanup of Makefile
- further cleanup of CMakeLists.txt
- remote the need from manually hardcoding extra metadata for targets in
build_autogenerated.yaml. Either add logic that determines the
properties of targets automatically, or use metadata from bazel BUILD.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
As per the title.
<!--
Your pull request will be routed to the following person by default for
triaging.
If you know who should review your pull request, please remove the
mentioning below.
-->
@donnadionne
Intercepted call class expect a timeout as parameters but a deadline is
provided instead. Only the UnaryUnary class is correctly created with a
timeout while the others remains with the deadlines. This commit fix the
instanciation of the other ones.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Implement DNS using dns service for iOS.
Current limitation:
1. Using a custom name server is not supported.
2. Only supports `LookupHostname`. `LookupSRV` and `LookupTXT` are not
implemented.
3. Not tested with single stack (ipv4 or ipv6) environment
4. ~Not tested with multiple ip records per stack~ manually tested with
wsj.com
5. Not tested with multiple interface environment
Refactored OpenCensus context propagation flow, now propagation happens
for each call and context will be automatically propagated from gRPC
server to gRPC client.
We're using `execution_context` in OpenCensus since the context is
related to OpenCensus and it helps wrap `contextVar` for us.
### Testing
* Added a new Bazel test case for context propagation.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Fix: #33643.
This change adds `__reduce__` to `AioRpcError` to allow pickle.
### Testing
Added bazel unit test, without this change, test will fail with error:
```
TypeError: AioRpcError.__init__() missing 3 required positional arguments: 'code', 'initial_metadata', and 'trailing_metadata'
```
Why: Cleanup for chttp2_transport ahead of promise conversion - lots of
logic has become interleaved throughout chttp2, so some effort to
isolate logic out is warranted ahead of that conversion.
What: Split configuration and policy tracking for each of ping rate
throttling and abuse detection into their own modules. Add tests for
them.
Incidentally: Split channel args into their own header so that we can
split the policy stuff into separate build targets.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR implements a c-ares based DNS resolver for EventEngine with the
reference from the original
[grpc_ares_wrapper.h](../blob/master/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.h).
The PosixEventEngine DNSResolver is implemented on top of that. Tests
which use the client channel resolver API
([resolver.h](../blob/master/src/core/lib/resolver/resolver.h#L54)) are
ported, namely the
[resolver_component_test.cc](../blob/master/test/cpp/naming/resolver_component_test.cc)
and the
[cancel_ares_query_test.cc](../blob/master/test/cpp/naming/cancel_ares_query_test.cc).
The WindowsEventEngine DNSResolver will use the same EventEngine's
grpc_ares_wrapper and will be worked on next.
The
[resolve_address_test.cc](https://github.com/grpc/grpc/blob/master/test/core/iomgr/resolve_address_test.cc)
which uses the iomgr
[DNSResolver](../blob/master/src/core/lib/iomgr/resolve_address.h#L44)
API has been ported to EventEngine's dns_test.cc. That leaves only 2
tests which use iomgr's API, notably the
[dns_resolver_cooldown_test.cc](../blob/master/test/core/client_channel/resolvers/dns_resolver_cooldown_test.cc)
and the
[goaway_server_test.cc](../blob/master/test/core/end2end/goaway_server_test.cc)
which probably need to be restructured to use EventEngine DNSResolver
(for one thing they override the original grpc_ares_wrapper's free
functions). I will try to tackle these in the next step.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts the following PRs: #32692#33087#33093#33427#33568
These changes seem to have introduced some flaky crashes. Reverting
while I investigate.
In preparation for implementing the promise based version, separate out
the legacy call data from the filter.
There are two commits here, each representing one phase of this code
movement:
66676d398c moves `class RetryFilter` into
the header and the vtable name into that class, as this will be shared
code between the implementations
4c84f115ad then moves `class
RetryFilter::CallData` into `class RetryFilterLegacyCallData`, and moves
*that* into its own file
Doing so makes me less confused as to what I'm editing going forward.
No functionality should be affected.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
More work on the dualstack backend design:
- Change round_robin to delegate to pick_first instead of creating
subchannels directly.
- Change pick_first such that when it is the child of a petiole policy,
it will unconditionally start a health watch.
- Change the client-side health checking code such that if client-side
health checking is not enabled, it will return the subchannel's raw
connectivity state.
- As part of this, we introduce a new endpoint_list library to be used
by petiole policies, which is intended to replace the existing
subchannel_list library. The only policy that will still directly
interact with subchannels is pick_first, so the relevant parts of the
subchannel_list functionality have been copied directly into that
policy. The subchannel_list library will be removed after all petiole
policies are updated to delegate to pick_first.
The address attribute interface was intended to provide a mechanism to
pass attributes separately from channel args, for values that do not
affect subchannel behavior and therefore do not need to be present in
the subchannel key, which does include channel args. However, the
mechanism as currently designed is fairly clunky and is probably not the
direction we will want to go in the long term.
Eventually, we will want some mechanism for registering channel args,
which would provide a cleaner way to indicate that a given channel arg
should not be used in the subchannel key, so that we don't need a
completely different mechanism. For now, this PR is just doing an
interim step, which is to establish a special channel arg key prefix to
indicate that an arg is not needed in the subchannel key.
- Switched from yapf to black
- Reconfigure isort for black
- Resolve black/pylint idiosyncrasies
Note: I used `--experimental-string-processing` because black was
producing "implicit string concatenation", similar to what described
here: https://github.com/psf/black/issues/1837. While currently this
feature is experimental, it will be enabled by default:
https://github.com/psf/black/issues/2188. After running black with the
new string processing so that the generated code merges these `"hello" "
world"` strings concatenations, then I removed
`--experimental-string-processing` for stability, and regenerated the
code again.
To the reviewer: don't even try to open "Files Changed" tab 😄 It's
better to review commit-by-commit, and ignore `run black and isort`.