This has been stable for a bit, everywhere that the EventEngine is
enabled. Going forward, I think the event_engine_{client|listener}
experiments can probably be used to regulate thread-pool-specific
issues.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Most recent attempt was #34320, reverted in #34335.
The first commit here is a pure revert. The second commit fixes the
outlier_detection unit test to pass both with and without the
experiment.
This is a follow up to https://github.com/grpc/grpc/pull/34103
That pull request explicitly aimed to introduce shared library builds
for Windows (DLLs) while effecting zero material change to the existing
build pipelines. That aspiration meant that the grpc++_unsecure library
had to be effectively excluded from the build (because including it
would have also included a dependency on openssl, which makes no sense
given its purpose)
This PR addresses that by:
* Extracting the single function in grpc_tls_certificate_provider with a
dependency on openssl into a separate compilation unit
* Including that new .cc file into the grpc library
* Including grpc_tls_certificate_provider and one other source file into
grpc_unsecure for the Windows DLL build only.
* Reinstating the grpc++_unsecure library which is a prerequisite for
many tests.
* Regenerating all files affected by the changes in Bazel BUILD that
introduce the new source file.
This change does affect the operation of other build pipelines - I have
confirmed that it does not break the Linux Bazel build.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Reverts grpc/grpc#34325
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
… c-ares versions (#34314)"
This reverts commit eb37b91072.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Tested by changing c-ares to `1.14.0` in Bazel dependency and verify
that the `posix_event_engine_test` build. The test would fail though
since c-ares versions < `1.16.0` does not come with address sorting
capability so the relevant test cases will fail. But this is probably
sufficient to make the Cloud C++ CI job pass and unblock the 1.58
release.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Building out a new framing layer for chttp2.
The central idea here is to have the framing layer be solely responsible
for serialization of frames, and their deserialization - the framing
layer can reject frames that have invalid syntax - but the enacting of
what that frame means is left to a higher layer.
This class will become foundational for the promise conversion of chttp2
- by eliminating action from the parsing of frames we can reuse this
sensitive code.
Right now the new layer is inactive - there's a test that exercises it
relatively well, and not much more. In the next PRs I'll add an
experiments to enable using this layer or the existing code in the
writing and reading paths.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Proposed alternative to https://github.com/grpc/grpc/pull/34024.
This version has a simpler, faster busy-count implementation based on a
sharded set of atomic counts: fast increment/decrement operations,
relatively slower summation of total counts (which need to happen much
less frequently).
Our current implementation of `Seq`, `TrySeq` leverage some complicated
template stuff to work, which makes them hard to maintain. I've been
thinking about ways to simplify that for some time and had something
like this in mind - using a code generator that's at least a little more
understandable to code generate most of the complexity into a file that
is checkable.
Concurrently - I have a cool optimization in mind - but it requires that
we can move promises after polling, which is a contract change. I'm
going to work through the set of primitives we have in the coming weeks
and change that contract to enable the optimization.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Implement DNS using dns service for iOS.
Current limitation:
1. Using a custom name server is not supported.
2. Only supports `LookupHostname`. `LookupSRV` and `LookupTXT` are not
implemented.
3. Not tested with single stack (ipv4 or ipv6) environment
4. ~Not tested with multiple ip records per stack~ manually tested with
wsj.com
5. Not tested with multiple interface environment
Resolves a set of failures seen rolling out promises - we need to read
all of the incoming payload before doing request matching.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Why: Cleanup for chttp2_transport ahead of promise conversion - lots of
logic has become interleaved throughout chttp2, so some effort to
isolate logic out is warranted ahead of that conversion.
What: Split configuration and policy tracking for each of ping rate
throttling and abuse detection into their own modules. Add tests for
them.
Incidentally: Split channel args into their own header so that we can
split the policy stuff into separate build targets.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR implements a c-ares based DNS resolver for EventEngine with the
reference from the original
[grpc_ares_wrapper.h](../blob/master/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.h).
The PosixEventEngine DNSResolver is implemented on top of that. Tests
which use the client channel resolver API
([resolver.h](../blob/master/src/core/lib/resolver/resolver.h#L54)) are
ported, namely the
[resolver_component_test.cc](../blob/master/test/cpp/naming/resolver_component_test.cc)
and the
[cancel_ares_query_test.cc](../blob/master/test/cpp/naming/cancel_ares_query_test.cc).
The WindowsEventEngine DNSResolver will use the same EventEngine's
grpc_ares_wrapper and will be worked on next.
The
[resolve_address_test.cc](https://github.com/grpc/grpc/blob/master/test/core/iomgr/resolve_address_test.cc)
which uses the iomgr
[DNSResolver](../blob/master/src/core/lib/iomgr/resolve_address.h#L44)
API has been ported to EventEngine's dns_test.cc. That leaves only 2
tests which use iomgr's API, notably the
[dns_resolver_cooldown_test.cc](../blob/master/test/core/client_channel/resolvers/dns_resolver_cooldown_test.cc)
and the
[goaway_server_test.cc](../blob/master/test/core/end2end/goaway_server_test.cc)
which probably need to be restructured to use EventEngine DNSResolver
(for one thing they override the original grpc_ares_wrapper's free
functions). I will try to tackle these in the next step.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts the following PRs: #32692#33087#33093#33427#33568
These changes seem to have introduced some flaky crashes. Reverting
while I investigate.
In preparation for implementing the promise based version, separate out
the legacy call data from the filter.
There are two commits here, each representing one phase of this code
movement:
66676d398c moves `class RetryFilter` into
the header and the vtable name into that class, as this will be shared
code between the implementations
4c84f115ad then moves `class
RetryFilter::CallData` into `class RetryFilterLegacyCallData`, and moves
*that* into its own file
Doing so makes me less confused as to what I'm editing going forward.
No functionality should be affected.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This adds a new channel argument `GRPC_ARG_DSCP` which allows users to
create classified gRPC streams with a
Differentiated Services Code Point (DSCP) marking on the IP frames.
The channel argument is handled on both clients and servers, but
currently only on posix based systems.
Fixes#17225
**Background**:
In addition to what is already described is #17225, when gRPC is used in
telco systems there is often a need to classify streams of importance.
There can be multiple hops between two endpoints (e.g. between 2 telecom
operators) and some streams that are more important than others (e.g.
emergency call related or similar). By marking the IP packets using DSCP
the aware routers can make a sound decision of the prioritization.
This PR propose to use DSCP as the configuration value since its common
for both IPv4/IPv6, an alternative would be to use a config name that
includes TOS and Traffic Class.
There might be more needed regarding documentation and end2end testing,
but there I need some advice.
**References**
https://datatracker.ietf.org/doc/html/rfc2474https://www.iana.org/assignments/dscp-registry/dscp-registry.xhtml
<!--
Your pull request will be routed to the following person by default for
triaging.
If you know who should review your pull request, please remove the
mentioning below.
-->
@yashykt
More work on the dualstack backend design:
- Change ring_hash policy to delegate to pick_first instead of creating
subchannels directly.
- Note that, as mentioned in the WIP gRFC, because we lazily create the
pick_first child policies, so there's no need to swap over to a new list
as an atomic whole. As a result, we don't use the endpoint_list library
in this policy; instead, we just update a map in-place.
- Remove now-unused subchannel_list library.
More work on the dualstack backend design:
- Change round_robin to delegate to pick_first instead of creating
subchannels directly.
- Change pick_first such that when it is the child of a petiole policy,
it will unconditionally start a health watch.
- Change the client-side health checking code such that if client-side
health checking is not enabled, it will return the subchannel's raw
connectivity state.
- As part of this, we introduce a new endpoint_list library to be used
by petiole policies, which is intended to replace the existing
subchannel_list library. The only policy that will still directly
interact with subchannels is pick_first, so the relevant parts of the
subchannel_list functionality have been copied directly into that
policy. The subchannel_list library will be removed after all petiole
policies are updated to delegate to pick_first.
The address attribute interface was intended to provide a mechanism to
pass attributes separately from channel args, for values that do not
affect subchannel behavior and therefore do not need to be present in
the subchannel key, which does include channel args. However, the
mechanism as currently designed is fairly clunky and is probably not the
direction we will want to go in the long term.
Eventually, we will want some mechanism for registering channel args,
which would provide a cleaner way to indicate that a given channel arg
should not be used in the subchannel key, so that we don't need a
completely different mechanism. For now, this PR is just doing an
interim step, which is to establish a special channel arg key prefix to
indicate that an arg is not needed in the subchannel key.
Revert "Revert "[core] Add support for vsock transport"
(https://github.com/grpc/grpc/pull/33276)"
This reverts commit
c5ade3011a.
And fix the issue which broke the python build.
@markdroth@drfloob please review this PR. Thank you very much.
---------
Co-authored-by: AJ Heller <hork@google.com>
The approach of doing a recursive function call to expand the if checks
for known metadata names was tripping up an optimization clang has to
collapse that if/then tree into an optimized tree search over the set of
known strings. By unrolling that loop (with a code generator) we start
to present a pattern that clang *can* recognize, and hopefully get some
more stable and faster code generation as a benefit.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This is another attempt to add support for vsock in grpc since previous
PRs(#24551, #21745) all closed without merging.
The VSOCK address family facilitates communication between
virtual machines and the host they are running on.
This patch will introduce new scheme: [vsock:cid:port] to
support VSOCK address family.
Fixes#32738.
---------
Signed-off-by: Yadong Qi <yadong.qi@intel.com>
Co-authored-by: AJ Heller <hork@google.com>
Co-authored-by: YadongQi <YadongQi@users.noreply.github.com>
Most of these data structures need to scale a bit like per-cpu, but not
entirely. We can have more than one cpu hit the same instance in most
cases, and probably want to cap out before the hundreds of shards some
platforms have.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This test mode tries to create threads wherever it legally can to
maximize the chances of TSAN finding errors in our codebase.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
The logger uses `absl::FPrintF` to write to stdout. After reading a
number of sources online, I got the impression that `std::fwrite` which
is used by `absl::FPrintF` is atomic so there is no locking required
here.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
This makes the JSON API visible as part of the C-core API, but in the
`experimental` namespace. It will be used as part of various
experimental APIs that we will be introducing in the near future, such
as the audit logging API.
This PR implements a work-stealing thread pool for use inside
EventEngine implementations. Because of historical risks here, I've
guarded the new implementation behind an experiment flag:
`GRPC_EXPERIMENTS=work_stealing`. Current default behavior is the
original thread pool implementation.
Benchmarks look very promising:
```
bazel test \
--test_timeout=300 \
--config=opt -c opt \
--test_output=streamed \
--test_arg='--benchmark_format=csv' \
--test_arg='--benchmark_min_time=0.15' \
--test_arg='--benchmark_filter=_FanOut' \
--test_arg='--benchmark_repetitions=15' \
--test_arg='--benchmark_report_aggregates_only=true' \
test/cpp/microbenchmarks:bm_thread_pool
```
2023-05-04: `bm_thread_pool` benchmark results on my local machine (64
core ThreadRipper PRO 3995WX, 256GB memory), comparing this PR to
master:
![image](https://user-images.githubusercontent.com/295906/236315252-35ed237e-7626-486c-acfa-71a36f783d22.png)
2023-05-04: `bm_thread_pool` benchmark results in the Linux RBE
environment (unsure of machine configuration, likely small), comparing
this PR to master.
![image](https://user-images.githubusercontent.com/295906/236317164-2c5acbeb-fdac-4737-9b2d-4df9c41cb825.png)
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Reverts grpc/grpc#32924. This breaks the build again, unfortunately.
From `test/core/event_engine/cf:cf_engine_test`:
```
error: module .../grpc/test/core/event_engine/cf:cf_engine_test does not depend on a module exporting 'grpc/support/port_platform.h'
```
@sampajano I recommend looking into CI tests to catch iOS problems
before merging. We can enable EventEngine experiments in the CI
generally once this PR lands, but this broken test is not one of those
experiments. A normal build should have caught this.
cc @HannahShiSFB