I'm fairly certain that this path should be non-blocking (and making it
so makes the promise based code far more tractable).
This moves the blocking behavior into the blocking server_cc.cc function
that calls `grpc_server_shutdown_and_notify` instead of in that
non-blocking function.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
There's *very* little difference in cost for a pooled arena allocation
and a tcmalloc allocation - but the pooled allocation causes memory
stranding for call lifetime, whereas the tcmalloc allocation allows that
to be shared between calls - leading to a much lower overall cost.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
~~NB: I haven't tested this at all and am hoping the CI will tell me
where I've (undoubtedly) messed something up.~~ Edit: looks like CI is
now clear!
BoringSSL's gas-compatible assembly files, like its C files, are now
wrapped with preprocessor ifdefs to capture which platforms each file
should be enabled on. This means that, provided the platform can process
.S files it all (i.e. not Windows), we no longer need to detect the
exact CPU architecture in the build.
Switch gRPC's build to take advantage of this. I've retained
BUILD_OVERRIDE_BORING_SSL_ASM_PLATFORM, on the off chance anyone is
using it to cross-compile between Windows and non-Windows, though I
doubt that works particularly well.
As part of this, restore assembly optimizations in a few places where
they were seemingly disabled for issues relating to this:
- https://github.com/grpc/grpc/pull/31747 had to disable the assembly,
because at the time assembly required the library be built differently
for each architecture and then stitched back together. This should now
work.
- tools/run_tests/run_tests.py disabled x86 assembly due to some issues
with CMAKE_SYSTEM_PROCESSOR in a Docker image. This too should now be
moot.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Add "bazelified" non-bazel tests. See tools/bazelify_tests/README.md for
the core idea.
- add a bunch of test targets that run under docker and execute tests
that correspond to `run_tests.py -l LANG ...`
- many more tests can be added in the future
- to enable running some of the C/C++ portability tests easily, added
support for `--cmake_extra_configure_args` in run_tests.py (the change
is fairly small).
Example passing build that shows how test results are structured:
https://source.cloud.google.com/results/invocations/21295351-a3e3-4be1-b6e9-aaf52195a044/targets
The previous version (`3.12`) is 7 years old and does not support the
newest Python 3 versions. This causes issues to move certain test
targets (which depends on `pyyaml`) to Python 3 when some CI environment
(e.g. `arm64v8/debian:11`) does not have Python 2 installed. And in
general, we should move away from Python 2. Thus, updated `pyyaml` to
the latest version.
This hopefully should also fix the
`prod:grpc/core/master/linux/arm64/grpc_bazel_test_c_cpp` job breakage.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Rolls forward https://github.com/grpc/grpc/pull/33871
Second and third commits here fix internal build issues
In particular, add a `// IWYU pragma: no_include <ares_build.h>` since
`ares.h` [includes that
anyways](bad62225b7/include/ares.h (L23))
(and seems unlikely for that to change since it would be breaking)
This enables both of the `event_engine_listener` and `work_stealing`
experiments together, which we expect will have better performance. The
benchmark-config-generation script required some light modification to
support running multiple experiments at the same time.
Implement DNS using dns service for iOS.
Current limitation:
1. Using a custom name server is not supported.
2. Only supports `LookupHostname`. `LookupSRV` and `LookupTXT` are not
implemented.
3. Not tested with single stack (ipv4 or ipv6) environment
4. ~Not tested with multiple ip records per stack~ manually tested with
wsj.com
5. Not tested with multiple interface environment
Need the ability to override server-side keepalive permit without calls
default without affecting client-side settings.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Normally, c-ares related fds are destroyed after all DNS resolution is
finished in [this code
path](c82d31677a/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc (L210)).
Also there are some fds that c-ares may fail to open or write to
initially, and c-ares will close them internally before grpc ever knows
about them.
But if:
1) c-ares opens a socket and successfully writes a request on it
2) then a subsequent read fails
Then c-ares will close the fd in [this code
path](bad62225b7/src/lib/ares_process.c (L740)),
but gRPC will have a reference on the fd and will still use it
afterwards.
Fix here is to leverage the c-ares socket-override API to properly track
fd ownership between c-ares and grpc.
Related: internal issue b/292203138
Over the past 5 days, this experiment has not introduced any new flakes,
nor increased any flake rates. Let's enable it for debug builds. To
prevent issues over the weekend, I plan to merge it next week, July 31st
(with announcement).
This adds a new GKE benchmark job, which runs the set of "dashboard"
scenarios for every gRPC experiment configured in the script. Results
are published to BigQuery at
`e2e_benchmarks.ci_cxx_experiment_results_${N}core.${experiment}`
See https://github.com/grpc/grpc/pull/33907 for the scenario config.
This PR fixes the bootstrap generator interop test by making the node
metadata flag dependent on version, which was causing a breakage
previously as all bootstrap generator version's don't necessarily
support the deexpiermentalized flag.
- Make `Value` a simple wrapper around `Pointer` and use some blessed
vtables to distinguish strings vs ints vs actual pointers - this saves 8
bytes per value stored
- introduce `RcString` as a lightweight container around an immutable
string - this saves some bytes vs the shared_ptr<std::string> approach
we previously had, and importantly opens up the technique (via
`RcStringValue`) to channel node keys also, which should increase
sharing and consequently also decrease total memory usage
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Refactored OpenCensus context propagation flow, now propagation happens
for each call and context will be automatically propagated from gRPC
server to gRPC client.
We're using `execution_context` in OpenCensus since the context is
related to OpenCensus and it helps wrap `contextVar` for us.
### Testing
* Added a new Bazel test case for context propagation.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This PR covers C++ only. Other language owners: feel free to ping me if
this would be useful for you.
This allows scenario_config to produce a small superset of tests
required to generate the performance dashboard
https://grafana-dot-grpc-testing.appspot.com/. This only adds a category
to some existing scenarios, and to my knowledge, should not affect any
current automation that uses scenario_config.
I plan to use this category to run gRPC-core experiments and produce
equivalent dashboards. It seems worth landing this independently of
those job configurations, but I'm flexible.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Inserts and removals create `O(log(n))` new nodes with a persistent AVL
- which is nice - but if there's ultimately no mutation even this is
wasteful. Do some extra work in channel args to verify that there is
indeed a mutation, otherwise continue to share the same underlying
object.
Reduces node size from 112 bytes to 88 bytes on x64 opt builds.
(also delete the unused specialization of `AVL<T, void>`)
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
The most important change here is to setting `resource_suffix` and
`server_xds_port` flags to "generate randomly" by default. Previously we
were suggesting static values, and devs ended up with resource conflict
errors.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Fix: #33643.
This change adds `__reduce__` to `AioRpcError` to allow pickle.
### Testing
Added bazel unit test, without this change, test will fail with error:
```
TypeError: AioRpcError.__init__() missing 3 required positional arguments: 'code', 'initial_metadata', and 'trailing_metadata'
```
CNR a WindowsEventEngine listener flake in:
* 10k local Windows development machine runs
* 50k Windows RBE runs
* 10k Windows VM runs
It fails ~5 times per day on the master CI jobs.
This PR adds some logging to try to see if an edge is missed, and
switches the thread pool implementation to see if that makes the flake
go away. If the flakes disappear, I'll try removing one or the other to
see if either independently fix the problem (hopefully not logging).
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>