This PR implements a c-ares based DNS resolver for EventEngine with the
reference from the original
[grpc_ares_wrapper.h](../blob/master/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.h).
The PosixEventEngine DNSResolver is implemented on top of that. Tests
which use the client channel resolver API
([resolver.h](../blob/master/src/core/lib/resolver/resolver.h#L54)) are
ported, namely the
[resolver_component_test.cc](../blob/master/test/cpp/naming/resolver_component_test.cc)
and the
[cancel_ares_query_test.cc](../blob/master/test/cpp/naming/cancel_ares_query_test.cc).
The WindowsEventEngine DNSResolver will use the same EventEngine's
grpc_ares_wrapper and will be worked on next.
The
[resolve_address_test.cc](https://github.com/grpc/grpc/blob/master/test/core/iomgr/resolve_address_test.cc)
which uses the iomgr
[DNSResolver](../blob/master/src/core/lib/iomgr/resolve_address.h#L44)
API has been ported to EventEngine's dns_test.cc. That leaves only 2
tests which use iomgr's API, notably the
[dns_resolver_cooldown_test.cc](../blob/master/test/core/client_channel/resolvers/dns_resolver_cooldown_test.cc)
and the
[goaway_server_test.cc](../blob/master/test/core/end2end/goaway_server_test.cc)
which probably need to be restructured to use EventEngine DNSResolver
(for one thing they override the original grpc_ares_wrapper's free
functions). I will try to tackle these in the next step.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts the following PRs: #32692#33087#33093#33427#33568
These changes seem to have introduced some flaky crashes. Reverting
while I investigate.
In preparation for implementing the promise based version, separate out
the legacy call data from the filter.
There are two commits here, each representing one phase of this code
movement:
66676d398c moves `class RetryFilter` into
the header and the vtable name into that class, as this will be shared
code between the implementations
4c84f115ad then moves `class
RetryFilter::CallData` into `class RetryFilterLegacyCallData`, and moves
*that* into its own file
Doing so makes me less confused as to what I'm editing going forward.
No functionality should be affected.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
More work on the dualstack backend design:
- Change round_robin to delegate to pick_first instead of creating
subchannels directly.
- Change pick_first such that when it is the child of a petiole policy,
it will unconditionally start a health watch.
- Change the client-side health checking code such that if client-side
health checking is not enabled, it will return the subchannel's raw
connectivity state.
- As part of this, we introduce a new endpoint_list library to be used
by petiole policies, which is intended to replace the existing
subchannel_list library. The only policy that will still directly
interact with subchannels is pick_first, so the relevant parts of the
subchannel_list functionality have been copied directly into that
policy. The subchannel_list library will be removed after all petiole
policies are updated to delegate to pick_first.
The address attribute interface was intended to provide a mechanism to
pass attributes separately from channel args, for values that do not
affect subchannel behavior and therefore do not need to be present in
the subchannel key, which does include channel args. However, the
mechanism as currently designed is fairly clunky and is probably not the
direction we will want to go in the long term.
Eventually, we will want some mechanism for registering channel args,
which would provide a cleaner way to indicate that a given channel arg
should not be used in the subchannel key, so that we don't need a
completely different mechanism. For now, this PR is just doing an
interim step, which is to establish a special channel arg key prefix to
indicate that an arg is not needed in the subchannel key.
Revert "Revert "[core] Add support for vsock transport"
(https://github.com/grpc/grpc/pull/33276)"
This reverts commit
c5ade3011a.
And fix the issue which broke the python build.
@markdroth@drfloob please review this PR. Thank you very much.
---------
Co-authored-by: AJ Heller <hork@google.com>
This is another attempt to add support for vsock in grpc since previous
PRs(#24551, #21745) all closed without merging.
The VSOCK address family facilitates communication between
virtual machines and the host they are running on.
This patch will introduce new scheme: [vsock:cid:port] to
support VSOCK address family.
Fixes#32738.
---------
Signed-off-by: Yadong Qi <yadong.qi@intel.com>
Co-authored-by: AJ Heller <hork@google.com>
Co-authored-by: YadongQi <YadongQi@users.noreply.github.com>
Most of these data structures need to scale a bit like per-cpu, but not
entirely. We can have more than one cpu hit the same instance in most
cases, and probably want to cap out before the hundreds of shards some
platforms have.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This reverts the changes made to ring_hash in #29872. The comment in the
picker code specifically says not to change these variables to an
unsigned type, but that's exactly what that PR did. I don't know if this
has actually been causing any problems, but given that we duplicated
this algorithm (and that comment) from elsewhere without doing a
detailed analysis of it, it seems prudent to stick with the types that
the original code suggested were important.
To avoid causing problems for ObjC, I have changed this such that
ring_hash is not built unless we are building with xDS support, which we
exclude on mobile. Currently, there is no way to use ring_hash without
xDS, although that might change in the future; if it does, we can deal
with any problems that arise at that point.
This test mode tries to create threads wherever it legally can to
maximize the chances of TSAN finding errors in our codebase.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
The logger uses `absl::FPrintF` to write to stdout. After reading a
number of sources online, I got the impression that `std::fwrite` which
is used by `absl::FPrintF` is atomic so there is no locking required
here.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
This PR implements a work-stealing thread pool for use inside
EventEngine implementations. Because of historical risks here, I've
guarded the new implementation behind an experiment flag:
`GRPC_EXPERIMENTS=work_stealing`. Current default behavior is the
original thread pool implementation.
Benchmarks look very promising:
```
bazel test \
--test_timeout=300 \
--config=opt -c opt \
--test_output=streamed \
--test_arg='--benchmark_format=csv' \
--test_arg='--benchmark_min_time=0.15' \
--test_arg='--benchmark_filter=_FanOut' \
--test_arg='--benchmark_repetitions=15' \
--test_arg='--benchmark_report_aggregates_only=true' \
test/cpp/microbenchmarks:bm_thread_pool
```
2023-05-04: `bm_thread_pool` benchmark results on my local machine (64
core ThreadRipper PRO 3995WX, 256GB memory), comparing this PR to
master:
![image](https://user-images.githubusercontent.com/295906/236315252-35ed237e-7626-486c-acfa-71a36f783d22.png)
2023-05-04: `bm_thread_pool` benchmark results in the Linux RBE
environment (unsure of machine configuration, likely small), comparing
this PR to master.
![image](https://user-images.githubusercontent.com/295906/236317164-2c5acbeb-fdac-4737-9b2d-4df9c41cb825.png)
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Reverts grpc/grpc#32924. This breaks the build again, unfortunately.
From `test/core/event_engine/cf:cf_engine_test`:
```
error: module .../grpc/test/core/event_engine/cf:cf_engine_test does not depend on a module exporting 'grpc/support/port_platform.h'
```
@sampajano I recommend looking into CI tests to catch iOS problems
before merging. We can enable EventEngine experiments in the CI
generally once this PR lands, but this broken test is not one of those
experiments. A normal build should have caught this.
cc @HannahShiSFB
1. `GrpcAuthorizationEngine` creates the logger from the given config in
its ctor.
2. `Evaluate()` invokes audit logging when needed.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
Add audit condition and audit logger config into `grpc_core::Rbac`.
Support translation of audit logging options from authz policy to it.
Audit logging options in authz policy looks like:
```json
{
"audit_logging_options": {
"audit_condition": "ON_DENY",
"audit_loggers": [
{
"name": "logger",
"config": {},
"is_optional": false
}
]
}
}
```
which is consistent with what's in the xDS RBAC proto but a little
flattened.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
Audit logging APIs for both built-in loggers and third-party logger
implementations.
C++ uses using decls referring to C-Core APIs.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
Third-party loggers will be added in subsequent PRs once the logger
factory APIs are available to validate the configs here.
This registry is used in `xds_http_rbac_filter.cc` to generate service
config json.
This paves the way for making pick_first the universal leaf policy (see
#32692), which will be needed for the dualstack design. That change will
require changing pick_first to see both the raw connectivity state and
the health-checking connectivity state of a subchannel, so that we can
enable health checking when pick_first is used underneath round_robin
without actually changing the pick_first connectivity logic (currently,
pick_first always disables health checking). To make it possible to do
that, this PR moves the health checking code out of the subchannel and
into a separate API using the same data-watcher mechanism that was added
for ORCA OOB calls.
The PR also creates a separate BUILD target for:
- chttp2 context list
- iomgr buffer_list
- iomgr internal errqueue
This would allow the context list to be included as standalone
dependencies for EventEngine implementations.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
@sampajano
The very non-trivial upgrade of third_party/protobuf to 22.x
This PR strives to be as small as possible and many changes that were
compatible with protobuf 21.x and didn't have to be merged atomically
with the upgrade were already merged.
Due to the complexity of the upgrade, this PR wasn't created
automatically by a tool, but manually. Subsequent upgraded of
third_party/protobuf with our OSS release script should work again once
this change is merged.
This is best reviewed commit-by-commit, I tried to group changes in
logical areas.
Notable changes:
- the upgrade of third_party/protobuf submodule, the bazel protobuf
dependency itself
- upgrade of UPB dependency to 22.x (in the past, we used to always
upgrade upb to "main", but upb now has release branch as well). UPB
needs to be upgraded atomically with protobuf since there's a de-facto
circular dependency (new protobuf depends on new upb, which depends on
new protobuf for codegen).
- some protobuf and upb bazel rules are now aliases, so `
extract_metadata_from_bazel_xml.py` and `gen_upb_api_from_bazel_xml.py`
had to be modified to be able to follow aliases and reach the actual
aliased targets.
- some protobuf public headers were renamed, so especially
`src/compiler` needed to be updated to use the new headers.
- protobuf and upb now both depend on utf8_range project, so since we
bundle upb with grpc in some languages, we now have to bundle utf8_range
as well (hence changes in build for python, PHP, objC, cmake etc).
- protoc now depends on absl and utf8_range (previously protobuf had
absl dependency, but not for the codegen part), so python's
make_grpcio_tools.py required partial rewrite to be able to handle those
dependencies in the grpcio_tools build.
- many updates and fixes required for C++ distribtests (currently they
all pass, but we'll probably need to follow up, make protobuf's and
grpc's handling of dependencies more aligned and revisit the
distribtests)
- bunch of other changes mostly due to overhaul of protobuf's and upb's
internal build layout.
TODOs:
- [DONE] make sure IWYU and clang_tidy_code pass
- create a list of followups (e.g. work to reenable the few tests I had
to disable and to remove workaround I had to use)
- [DONE in cl/523706129] figure out problem(s) with internal import
---------
Co-authored-by: Craig Tiller <ctiller@google.com>
Notes:
- `+trace` fixtures haven't run since 2016, so they're disabled for now
(7ad2d0b463 (diff-780fce7267c34170c1d0ea15cc9f65a7f4b79fefe955d185c44e8b3251cf9e38R76))
- all current fixtures define `FEATURE_MASK_SUPPORTS_AUTHORITY_HEADER`
and hence `authority_not_supported` has not been run in years - deleted
- bad_hostname similarly hasn't been triggered in a long while, so
deleted
- load_reporting_hook has never been enabled, so deleted
(f23fb4cf31/test/core/end2end/generate_tests.bzl (L145-L148))
- filter_latency & filter_status_code rely on global variables and so
don't convert particularly cleanly - and their value seems marginal, so
deleted
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR also centralizes the client channel resolver selection. Resolver
selection is still done using the plugin system, but when the Ares and
native client channel resolvers go away, we can consider bootstrapping
this differently.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
(hopefully last try)
Add new channel arg GRPC_ARG_ABSOLUTE_MAX_METADATA_SIZE as hard limit
for metadata. Change GRPC_ARG_MAX_METADATA_SIZE to be a soft limit.
Behavior is as follows:
Hard limit
(1) if hard limit is explicitly set, this will be used.
(2) if hard limit is not explicitly set, maximum of default and soft
limit * 1.25 (if soft limit is set) will be used.
Soft limit
(1) if soft limit is explicitly set, this will be used.
(2) if soft limit is not explicitly set, maximum of default and hard
limit * 0.8 (if hard limit is set) will be used.
Requests between soft and hard limit will be rejected randomly, requests
above hard limit will be rejected.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR adds the view `grpc.io/client/api_latency` for GCP Observability
which aims to collect the end-to-end time taken by a call.
Changes made to support this -
1) A global interceptor factory registration is created for stats
plugins.
2) OpenCensus plugin now provides a new interceptor that's responsible
for collecting the new latency.
3) Gcp Observability registers this plugin.
4) A new OpenCensus measurement and view is created for api latency.
Note that this is internal as of now, since it's not clear if it should
be exposed as public experimental API. Leaving that decision for the
future.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This is a big rewrite of global config.
It does a few things, all somewhat intertwined:
1. centralize the list of configuration we have to a yaml file that can
be parsed, and code generated from it
2. add an initialization and a reset stage so that config vars can be
centrally accessed very quickly without the need for caching them
3. makes the syntax more C++ like (less macros!)
4. (optionally) adds absl flags to the OSS build
This first round of changes is intended to keep the system where it is
without major changes. We pick up absl flags to match internal code and
remove one point of deviation - but importantly continue to read from
the environment variables. In doing so we don't force absl flags on our
customers - it's possible to configure grpc without the flags - but
instead allow users that do use absl flags to configure grpc using that
mechanism. Importantly this lets internal customers configure grpc the
same everywhere.
Future changes along this path will be two-fold:
1. Move documentation generation into the code generation step, so that
within the source of truth yaml file we can find all documentation and
data about a configuration knob - eliminating the chance of forgetting
to document something in all the right places.
2. Provide fuzzing over configurations. Currently most config variables
get stashed in static constants across the codebase. To fuzz over these
we'd need a way to reset those cached values between fuzzing rounds,
something that is terrifically difficult right now, but with these
changes should simply be a reset on `ConfigVars`.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This change mostly aims to get OpenCensus to use the new
ServerCallTracer interface. Note that the interfaces nor the code are in
their final states. There are a bunch of moving pieces, but I thought
this might be a nice mid-step to check-in and make sure that our
internal traces can also work with these changes.
Overall changes -
1) call_tracer.h shows what the hierarchy of new call tracer interfaces
looks like. Open to renaming suggestions.
2) Moved most of the common interface between `CallAttemptTracer` and
`ServerCallTracer` into a common `CallTracerInterface`. We should be
able to eventually move `RecordReceivedTrailingMetadata` and `RecordEnd`
as well to these common interfaces, but it requires some additional
work.
3) The compression filter is now responsible for recording the recv and
send messages for both the subchannel call and the server, and adds in
ability to record compressed and decompressed messages as well.
4) The OpenCensus server filter now uses the new `ServerCallTracer`
interface, and so doesn't need to be a filter anymore.
5) A new ServerCallTracerFilter was added. Ideally, we should be able to
move it to the current connected filter, but it is in a bit of an
interesting state right now, so I would prefer making those changes in a
separate PR with Craig's eyes on it.
6) A new context element `GRPC_CONTEXT_CALL_TRACER_ANNOTATION_INTERFACE`
was created that replaces the old `GRPC_CONTEXT_CALL_TRACER`, and the
new `GRPC_CONTEXT_CALL_TRACER` is mainly to pass the `CallAttemptTracer`
down the stack. This should go away in the new promise-based world.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!-- Reviewable:start -->
- - -
This change is [<img src="https://reviewable.io/review_button.svg"
height="34" align="absmiddle"
alt="Reviewable"/>](https://reviewable.io/reviews/grpc/grpc/32618)
<!-- Reviewable:end -->