Local Bazel invocation succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --output_base=C:\bazel2 test --dynamic_mode=off --verbose_failures //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 up-to-date:
bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker@poller=epoll1.exe
INFO: Elapsed time: 199.262s, Critical Path: 193.48s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions
//test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 PASSED in 193.4s
Executed 1 out of 1 test: 1 test passes.
```
The local invocation of RBE failed with linker error `LINK : error
LNK2001: unresolved external symbol mainCRTStartup`, but that does not
limited to this target:
https://gist.github.com/yijiem/2c6cbd9a31209a6de8fd711afbf2b479.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Update from gtcooke94:
This PR adds support to build gRPC and it's tests with OpenSSL3. There were some
hiccups with tests as the tests with openssl haven't been built or exercised in a
few months, so they needed some work to fix.
Right now I expect all test files to pass except the following:
- h2_ssl_cert_test
- ssl_transport_security_utils_test
I confirmed locally that these tests fail with OpenSSL 1.1.1 as well,
thus we are at least not introducing regressions. Thus, I've added compiler directives around these tests so they only build when using BoringSSL.
---------
Co-authored-by: Gregory Cooke <gregorycooke@google.com>
Co-authored-by: Esun Kim <veblush@google.com>
- add debug-only `WorkSerializer::IsRunningInWorkSerializer()` method
and use it in client_channel to verify that subchannels are destroyed in
the `WorkSerializer`
- note: this mechanism uses `std:🧵:id`, so I had to exclude
work_serializer.cc from the core_banned_constructs check
- fix `WorkSerializer::Run()` to unref the callback before releasing
ownership of the `WorkSerializer`, so that any refs captured by the
`std::function<>` will be released before releasing ownership
- fix the WRR timer callback to hop into the `WorkSerializer` to release
its ref to the picker, since that transitively releases refs to
subchannels
- fix subchannel connectivity state notifications to unref the watcher
inside the `WorkSerializer`, since the watcher often transitively holds
refs to subchannels
Basically run each of the subtests (buildtest, distribtest_cpp,
distribtest_python) as a separate bazel target.
- currently the bazel distribtest are the slowest targets in
grpc_bazel_rbe_nonbazel
- the shards are basically independent tests anyway
- when split into multiple targets, they each get a separate target log
so it's easier debug issues since there isn't multiple bazel invocations
in each log.
CNR the flake, but I've changed the test (which is very old) to use some
of our more modern helper functions that have saner timeouts.
Also re-add a `return` statement that was accidentally removed in
#33753, which I noticed while working on this. Its absence doesn't cause
a real problem, but it does cause us to needlessly trigger a duplicate
connection attempt or report a duplicate CONNECTING update in some
cases.
Proposed alternative to https://github.com/grpc/grpc/pull/34024.
This version has a simpler, faster busy-count implementation based on a
sharded set of atomic counts: fast increment/decrement operations,
relatively slower summation of total counts (which need to happen much
less frequently).
WRR is showing a very high CPU cost relative to previous solutions, and
it's unclear why this is.
Add two metrics that should help us see the shape of the subchannel sets
that are being passed to high cost systems in order to confirm/deny
theories.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
De-experiment pick first since we have both affinity and randomness E2E
test running successfully.
---------
Co-authored-by: Yash Tibrewal <yashkt@google.com>
The current `py_grpc_library` results in the wrong grpc proto python
code path when grpc is a third-party source code in a Bazel project.
This PR should fix it.
fixes#31011
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: Richard Belleville <rbellevi@google.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This PR is mainly a set of improvements that allow the C++ Alarm to be
migrated away from legacy iomgr. It cannot be landed without significant
speedup, due to third-parties relying on a fast path for immediate timer
execution with deadlines <= now.
Previous EventEngine performance of bm_alarm, compared to baseline iomgr
timers: *0.014%*
This PR: *2.5%*
Regarding previous failures to land this change: The cloud libraries
team agreed to reduce the amount of stress in their alarm stress test
https://github.com/googleapis/google-cloud-cpp/pull/12378
Our current implementation of Join, TryJoin leverage some complicated
template stuff to work, which makes them hard to maintain. I've been
thinking about ways to simplify that for some time and had something
like this in mind - using a code generator that's at least a little more
understandable to code generate most of the complexity into a file that
is checkable.
Concurrently - I have a cool optimization in mind - but it requires that
we can move promises after polling, which is a contract change. I'm
going to work through the set of primitives we have in the coming weeks
and change that contract to enable the optimization.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Fix sticky-TF behavior such that once we enter TRANSIENT_FAILURE, we do
not leave that state if we get a new address list.
Also, fix handling of subchannels in state TRANSIENT_FAILURE.
Previously, if a subchannel was already in state TRANSIENT_FAILURE when
we wanted to start a connection attempt on it (e.g., because the
subchannel already existed from a different channel, or because it
already existed in the previous subchannel list), we would wait for it
to report IDLE before attempting to connect. This PR changes pick_first
to instead immediately skip the subchannel and move on to the next one.
Now, the only time we wait for a subchannel in TRANSIENT_FAILURE is when
we wrap back around to the first subchannel in the list.
Port https://github.com/grpc/grpc/pull/33871 to EE's
GrpcPolledFdFactoryPosix.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Not adding CMake support yet
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Our current implementation of `Seq`, `TrySeq` leverage some complicated
template stuff to work, which makes them hard to maintain. I've been
thinking about ways to simplify that for some time and had something
like this in mind - using a code generator that's at least a little more
understandable to code generate most of the complexity into a file that
is checkable.
Concurrently - I have a cool optimization in mind - but it requires that
we can move promises after polling, which is a contract change. I'm
going to work through the set of primitives we have in the coming weeks
and change that contract to enable the optimization.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
We decided to not populate `policy_name` with the HTTP filter name in
xDS case. So removing it from `GenerateServiceConfig`. This will be
consistent across languages. The gRFC
[PR](https://github.com/grpc/proposal/pull/346) has been updated.
I have a cool optimization in mind for promises - but it requires that
we can move them after polling, which is a contract change. I'm going to
work through the set of primitives we have in the coming weeks and
change that contract to enable the optimization.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR adds in delegating call tracers that work like so -
If this is the first call tracer that is being added onto the call
context, just add it as earlier.
If this is the second call tracer that is being added onto the call
context, create a delegating call tracer that contains a list of call
tracers (including the first call tracer).
Any more call tracers added, just get added to the list of tracers in
the delegating call tracer.
(This is not yet used other than through tests.)
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This doesn't matter for gtest (it does its own sorting later), but for
the fuzzers this will probably save a bit of churn in the corpus and
lead to faster discovery of interesting cases.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
I'm fairly certain that this path should be non-blocking (and making it
so makes the promise based code far more tractable).
This moves the blocking behavior into the blocking server_cc.cc function
that calls `grpc_server_shutdown_and_notify` instead of in that
non-blocking function.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
There's *very* little difference in cost for a pooled arena allocation
and a tcmalloc allocation - but the pooled allocation causes memory
stranding for call lifetime, whereas the tcmalloc allocation allows that
to be shared between calls - leading to a much lower overall cost.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Add "bazelified" non-bazel tests. See tools/bazelify_tests/README.md for
the core idea.
- add a bunch of test targets that run under docker and execute tests
that correspond to `run_tests.py -l LANG ...`
- many more tests can be added in the future
- to enable running some of the C/C++ portability tests easily, added
support for `--cmake_extra_configure_args` in run_tests.py (the change
is fairly small).
Example passing build that shows how test results are structured:
https://source.cloud.google.com/results/invocations/21295351-a3e3-4be1-b6e9-aaf52195a044/targets