This makes the resolver component tests suite run on Window RBE by
adding a flag in the test driver to further differentiate between Bazel
local run and Bazel RBE run on Windows since they have different
RUNFILES behavior.
Local Bazel run succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --output_base=C:\bazel2 test --dynamic_mode=off --verbose_failures --test_arg=--running_locally=true //test/cpp/naming:resolver_component_tests_runner_invoker
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker up-to-date:
bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker.exe
INFO: Elapsed time: 196.080s, Critical Path: 193.21s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions
//test/cpp/naming:resolver_component_tests_runner_invoker PASSED in 193.1s
Executed 1 out of 1 test: 1 test passes.
```
RBE run succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --bazelrc=tools/remote_build/windows.bazelrc test --config=windows_opt --dynamic_mode=off --verbose_failures --host_linkopt=/NODEFAULTLIB:libcmt.lib --host_linkopt=/DEFAULTLIB:msvcrt.lib --nocache_test_results //test/cpp/naming:resolver_component_tests_runner_invoker
INFO: Invocation ID: d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
WARNING: --remote_upload_local_results is set, but the remote cache does not support uploading action results or the current account is not authorized to write local results to the remote cache.
INFO: Streaming build results to: https://source.cloud.google.com/results/invocations/d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker (0 packages loaded, 133 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker up-to-date:
bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker.exe
INFO: Elapsed time: 41.627s, Critical Path: 39.42s
INFO: 2 processes: 1 internal, 1 remote.
//test/cpp/naming:resolver_component_tests_runner_invoker PASSED in 33.0s
Executed 1 out of 1 test: 1 test passes.
INFO: Streaming build results to: https://source.cloud.google.com/results/invocations/d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
INFO: Build completed successfully, 2 total actions
```
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This reverts commit fe1ba18dfc.
Reason: break import
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This helps developers run benchmark loadtests locally. See comments in
scenario_runner.py for usage.
---------
Co-authored-by: drfloob <drfloob@users.noreply.github.com>
Previously black wouldn't install, as it required newer `packaging`
package.
This fixes `pip install -r requirements-dev.txt`. In addition, `black`
in dev dependencies file is changed to `black[d]`, which bundles
`blackd` binary (["black as a
server"](https://black.readthedocs.io/en/stable/usage_and_configuration/black_as_a_server.html)).
The `work_stealing` experiment on its own is not very valuable, so let's
delete it and save CI resources. We have a benchmark for
`GRPC_EXPERIMENTS=event_engine_listener,work_stealing`, which is really
what we care about right now.
Fixes an issue when an active context selected automatically picked up
as context for `secondary_k8s_api_manager`.
This was introducing an error in GAMMA Baseline PoC
```
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('100.71.2.143', 56723), raddr=('35.199.174.232', 443)>
```
Here's how the secondary context is incorrectly falls back to the
default context when `--secondary_kube_context` is not set:
```
k8s.py:142] Using kubernetes context "gke_grpc-testing_us-central1-a_psm-interop-security", active host: https://35.202.85.90
k8s.py:142] Using kubernetes context "None", active host: https://35.202.85.90
```
This WorkStealingThreadPool change improves the (lagging)
`cpp_protobuf_async_streaming_qps_unconstrained_secure` 32-core
benchmark.
Baseline OriginalThreadPool QPS: 830k
Previous average WorkStealingThreadPool QPS: 755k
New WorkStealingThreadPool average (2 runs) QPS: 850k
We enabled OpenSSL3 testing with #31256 and missed a failing test
It wasn't running before, so this isn't a regression - disabling it so
master doesn't fail while we figure out how to fix it.
- Add Github Action to conditionally run PSM Interop unit tests:
- Only run when changes are detected in
`tools/run_tests/xds_k8s_test_driver` or any of the proto files used by
the driver
- Only run against PRs and pushes to `master`, `v1.*.*` branches
- Runs using `python3.9` and `python3.10`
- Ready to be added to the list of required GitHub checks
- Add `tools/run_tests/xds_k8s_test_driver/tests/unit/__main__.py` test
loader that recursively discovers all unit tests in
`tools/run_tests/xds_k8s_test_driver/tests/unit`
- Add basic coverage for `XdsTestClient` and `XdsTestServer` to verify
the test loader picks up all folders
Related:
- First unit tests without automated CI added in #34097
The tests are skipped incorrectly because `config.server_lang` is
incorrectly compared with the string value "java", instead of
`skips.Lang.JAVA`.
This has been broken since #26998.
```
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInRouteRule -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInRouteRule)
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInApplication -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[ SKIPPED ] setUpClass (timeout_test.TestTimeoutInApplication)
```
Add example for TLS.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Some versions of MacOS (as well as some `glibc`-based platforms) require
`__STDC_FORMAT_MACROS` to be defined prior to including `<cinttypes>` or
`<inttypes.h>`, otherwise build fails with undeclared `PRIdMAX`,
`PRIdPTR` etc.
See note here: https://en.cppreference.com/w/cpp/types/integer
label: (release notes: no)
This is to make sure upgrading packaging module won't break our logic on
version-based version skipping.
This also fixes a small issue with `dev-` prefix - it should only be
allowed on the left side of the comparison.
Context: packaging module needs to be upgraded to be compatible with
`blackd`.
Local Bazel invocation succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --output_base=C:\bazel2 test --dynamic_mode=off --verbose_failures //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 up-to-date:
bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker@poller=epoll1.exe
INFO: Elapsed time: 199.262s, Critical Path: 193.48s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions
//test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 PASSED in 193.4s
Executed 1 out of 1 test: 1 test passes.
```
The local invocation of RBE failed with linker error `LINK : error
LNK2001: unresolved external symbol mainCRTStartup`, but that does not
limited to this target:
https://gist.github.com/yijiem/2c6cbd9a31209a6de8fd711afbf2b479.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Update from gtcooke94:
This PR adds support to build gRPC and it's tests with OpenSSL3. There were some
hiccups with tests as the tests with openssl haven't been built or exercised in a
few months, so they needed some work to fix.
Right now I expect all test files to pass except the following:
- h2_ssl_cert_test
- ssl_transport_security_utils_test
I confirmed locally that these tests fail with OpenSSL 1.1.1 as well,
thus we are at least not introducing regressions. Thus, I've added compiler directives around these tests so they only build when using BoringSSL.
---------
Co-authored-by: Gregory Cooke <gregorycooke@google.com>
Co-authored-by: Esun Kim <veblush@google.com>
- add debug-only `WorkSerializer::IsRunningInWorkSerializer()` method
and use it in client_channel to verify that subchannels are destroyed in
the `WorkSerializer`
- note: this mechanism uses `std:🧵:id`, so I had to exclude
work_serializer.cc from the core_banned_constructs check
- fix `WorkSerializer::Run()` to unref the callback before releasing
ownership of the `WorkSerializer`, so that any refs captured by the
`std::function<>` will be released before releasing ownership
- fix the WRR timer callback to hop into the `WorkSerializer` to release
its ref to the picker, since that transitively releases refs to
subchannels
- fix subchannel connectivity state notifications to unref the watcher
inside the `WorkSerializer`, since the watcher often transitively holds
refs to subchannels
Basically run each of the subtests (buildtest, distribtest_cpp,
distribtest_python) as a separate bazel target.
- currently the bazel distribtest are the slowest targets in
grpc_bazel_rbe_nonbazel
- the shards are basically independent tests anyway
- when split into multiple targets, they each get a separate target log
so it's easier debug issues since there isn't multiple bazel invocations
in each log.
Currently the bazel invocations provide a link back to the original
kokoro jobs for resultstore and sponge UIs.
This adds another back link to Fusion UI, which has the advantage of
- being able to navigate to the kokoro job overview
- there is a button to trigger a new build (in case the job needs to be
re-run).
![image](https://github.com/grpc/grpc/assets/9939684/42f0bbc2-14f5-45db-89f7-73e48e32e7c9)
CNR the flake, but I've changed the test (which is very old) to use some
of our more modern helper functions that have saner timeouts.
Also re-add a `return` statement that was accidentally removed in
#33753, which I noticed while working on this. Its absence doesn't cause
a real problem, but it does cause us to needlessly trigger a duplicate
connection attempt or report a duplicate CONNECTING update in some
cases.
- Upgrade bazel
- Reduce the number of places where bazel version needs to be upgraded
in future.
- also make sure the list of bazel versions to test by bazelified tests
is loaded from supported_versions.txt (it was hardcoded before).
- ~~Try upgrading windows RBE build to bazel 6.3.2 as well.~~
The core idea:
- the source of truth for supported bazel versions is in
`bazel/supported_versions.txt`
- the first version listed in `bazel/supported_versions.txt` is
considered to be the "primary" bazel version and is going to be used in
most places thoroughout the repo.
- use templates to include the primary bazel version in testing
dockerfiles and in a newly introduced `.bazelversion` files (which gets
loaded by our existing `tools/bazel` wrapper).
~~Supersedes https://github.com/grpc/grpc/pull/33880~~
Proposed alternative to https://github.com/grpc/grpc/pull/34024.
This version has a simpler, faster busy-count implementation based on a
sharded set of atomic counts: fast increment/decrement operations,
relatively slower summation of total counts (which need to happen much
less frequently).
WRR is showing a very high CPU cost relative to previous solutions, and
it's unclear why this is.
Add two metrics that should help us see the shape of the subchannel sets
that are being passed to high cost systems in order to confirm/deny
theories.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
- Remove no-longer-needed workaround for b/275571385 (and switch back to
using docker image from GAR)
- regenerate RBE linux toolchain for bazel 6.1.2 while at it.
De-experiment pick first since we have both affinity and randomness E2E
test running successfully.
---------
Co-authored-by: Yash Tibrewal <yashkt@google.com>
Needs https://github.com/grpc/grpc/pull/34035 to be merged first. With
newer ccache in available in the gcc12 test image, the build is now
faster so we can reenable the test.
This should get the benchmarks running again. The dotnet benchmark is
broken (unclear if it's still necessary), and the grpc-go benchmark
build currently fails. The go benchmark should be re-enabled when the
dockerfiles are fixed. The rest of the dotnet benchmark configuration /
artifacts should be deleted or fixed as well. @jtattermusch
Based on https://github.com/grpc/grpc/pull/34033
Bunch of cleanup and rebuilding many docker images from scratch
- consolidate the workaround for "dubious ownership" issue reported by
git. Other team members have run into this recently and used similar but
not identical workarounds so some cleanup is due.
- rebuilding many images increases the chance that we fix the "dubious
ownership" git issue early on rather than later on in the one-at-a-time
fashion in the future (and the former will prevent many teammembers from
wasting time on this weird issue).
- Newer version of ccache is needed for some portability tests to be
able to benefit from caching (e.g. the GCC 12 portability test to get
benefits of local disk caching) - this is a prerequisite for reenabling
the bazelified gcc12 portability test.
- upgrade node interop images to debian:11 (since debian jessie is long
past EOL).
The current `py_grpc_library` results in the wrong grpc proto python
code path when grpc is a third-party source code in a Bazel project.
This PR should fix it.
fixes#31011
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: Richard Belleville <rbellevi@google.com>