<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
@sampajano
Previously, we didn't configure the failureThreshold, so it used its
default value. The final `startupProbe` looked like this:
```json
{
"startupProbe": {
"failureThreshold": 3,
"periodSeconds": 3,
"successThreshold": 1,
"tcpSocket": {
"port": 8081
},
"timeoutSeconds": 1
}
```
Because of it, the total time before k8s killed the container was 3
times `failureThreshold` * 3 seconds wait between probes `periodSeconds`
= 9 seconds total (±3 seconds waiting for the probe response).
This greatly affected PSM Security test server, some implementations of
which waited for the ADS stream to be configured before starting
listening on the maintenance port. This lead for the server container
being killed for ~7 times before a successful startup:
```
15:55:08.875586 "Killing container with a grace period"
15:53:38.875812 "Killing container with a grace period"
15:52:47.875752 "Killing container with a grace period"
15:52:38.874696 "Killing container with a grace period"
15:52:14.874491 "Killing container with a grace period"
15:52:05.875400 "Killing container with a grace period"
15:51:56.876138 "Killing container with a grace period"
```
These extra delays lead to PSM security tests timing out.
ref b/277336725
This maybe used to quickly verify the code coverage of a modified test
locally (e.g. fuzzer).
Example:
```
# Build and run target; the raw profile will be written to $LLVM_PROFILE_FILE when the program exits
$ bazel build --config=dbg --config=fuzzer_asan --config=coverage //test/core/end2end/fuzzers:api_fuzzer
$ LLVM_PROFILE_FILE="api_fuzzer.profraw" bazel-bin/test/core/end2end/fuzzers/api_fuzzer test/core/end2end/fuzzers/api_fuzzer_corpus/*
# Create coverage report
$ llvm-profdata-14 merge -sparse api_fuzzer.profraw -o api_fuzzer.profdata
$ llvm-cov-14 report ./bazel-bin/test/core/end2end/fuzzers/api_fuzzer --instr-profile=api_fuzzer.profdata
```
Sample report:
f94e444f25/gistfile1.txt
One trick is that the binary needs to be statically linked, e.g. specify
`linkstatic = 1` on the BUILD target.
See https://clang.llvm.org/docs/SourceBasedCodeCoverage.html for more
info.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
It already started hitting the limit resulting in continuous failure.
https://github.com/grpc/grpc/pull/32603 is believed to contribute to
this time increase but let's bump it first and visit this issue later.
The very non-trivial upgrade of third_party/protobuf to 22.x
This PR strives to be as small as possible and many changes that were
compatible with protobuf 21.x and didn't have to be merged atomically
with the upgrade were already merged.
Due to the complexity of the upgrade, this PR wasn't created
automatically by a tool, but manually. Subsequent upgraded of
third_party/protobuf with our OSS release script should work again once
this change is merged.
This is best reviewed commit-by-commit, I tried to group changes in
logical areas.
Notable changes:
- the upgrade of third_party/protobuf submodule, the bazel protobuf
dependency itself
- upgrade of UPB dependency to 22.x (in the past, we used to always
upgrade upb to "main", but upb now has release branch as well). UPB
needs to be upgraded atomically with protobuf since there's a de-facto
circular dependency (new protobuf depends on new upb, which depends on
new protobuf for codegen).
- some protobuf and upb bazel rules are now aliases, so `
extract_metadata_from_bazel_xml.py` and `gen_upb_api_from_bazel_xml.py`
had to be modified to be able to follow aliases and reach the actual
aliased targets.
- some protobuf public headers were renamed, so especially
`src/compiler` needed to be updated to use the new headers.
- protobuf and upb now both depend on utf8_range project, so since we
bundle upb with grpc in some languages, we now have to bundle utf8_range
as well (hence changes in build for python, PHP, objC, cmake etc).
- protoc now depends on absl and utf8_range (previously protobuf had
absl dependency, but not for the codegen part), so python's
make_grpcio_tools.py required partial rewrite to be able to handle those
dependencies in the grpcio_tools build.
- many updates and fixes required for C++ distribtests (currently they
all pass, but we'll probably need to follow up, make protobuf's and
grpc's handling of dependencies more aligned and revisit the
distribtests)
- bunch of other changes mostly due to overhaul of protobuf's and upb's
internal build layout.
TODOs:
- [DONE] make sure IWYU and clang_tidy_code pass
- create a list of followups (e.g. work to reenable the few tests I had
to disable and to remove workaround I had to use)
- [DONE in cl/523706129] figure out problem(s) with internal import
---------
Co-authored-by: Craig Tiller <ctiller@google.com>
Since we are adding an annotation for experiments internally, we need to
ensure the length of the annotation does not exceed 2 KB (used slightly
lower number in the check). Added a check to gen_experiments.py for
this.
Various fixes that can be merged in advance, to simplify the 22.x
upgrade.
- Use NOMINMAX for grpcio_tools build to avoid build failure when absl
is added as a dependency. Fixes
https://github.com/grpc/grpc/issues/32779
- in spawn_patch command file, put arguments on multiple lines to avoid
lines going over the limit for large link commands: Fixes
https://github.com/grpc/grpc/issues/32795
- Use `python/generator.h` instead of the deprecated
`python/python_generator.h` in grpcio_tools.
Followup for https://github.com/grpc/grpc/pull/32649 (which disabled the
tests mentioned below).
Also sets correct path for tests build by ninja on windows, so that they
don't get skipped.
Once merged, I'll backport to 1.54.x and 1.53.x
The original issue with tests being skipped.
```
+ python3 workspace_c_windows_dbg_native/tools/run_tests/run_tests.py -t -j 8 -x run_tests/c_windows_dbg_native/sponge_log.xml --report_suite_name c_windows_dbg_native -l c -c dbg --iomgr_platform native --bq_result_table aggregate_results --measure_cpu_costs
2023-03-20 07:56:53,523 START: tools\run_tests\helper_scripts\build_cxx.bat
2023-03-20 08:04:51,388 PASSED: tools\run_tests\helper_scripts\build_cxx.bat [time=477.9sec, retries=0:0; cpu_cost=0.0; estimated=1.0]
2023-03-20 08:04:52,434 detected port server running version 21
2023-03-20 08:04:52,672 my port server is version 21
2023-03-20 08:04:52,703 SUCCESS: All tests passed
WARNING: binary not found, skipping cmake/build/Debug/bad_server_response_test.exe
WARNING: binary not found, skipping cmake/build/Debug/connection_refused_test.exe
WARNING: binary not found, skipping cmake/build/Debug/goaway_server_test.exe
WARNING: binary not found, skipping cmake/build/Debug/invalid_call_argument_test.exe
WARNING: binary not found, skipping cmake/build/Debug/multiple_server_queues_test.exe
WARNING: binary not found, skipping cmake/build/Debug/no_server_test.exe
WARNING: binary not found, skipping cmake/build/Debug/pollset_windows_starvation_test.exe
WARNING: binary not found, skipping cmake/build/Debug/public_headers_must_be_c89.exe
```
Notes:
- `+trace` fixtures haven't run since 2016, so they're disabled for now
(7ad2d0b463 (diff-780fce7267c34170c1d0ea15cc9f65a7f4b79fefe955d185c44e8b3251cf9e38R76))
- all current fixtures define `FEATURE_MASK_SUPPORTS_AUTHORITY_HEADER`
and hence `authority_not_supported` has not been run in years - deleted
- bad_hostname similarly hasn't been triggered in a long while, so
deleted
- load_reporting_hook has never been enabled, so deleted
(f23fb4cf31/test/core/end2end/generate_tests.bzl (L145-L148))
- filter_latency & filter_status_code rely on global variables and so
don't convert particularly cleanly - and their value seems marginal, so
deleted
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
When using this internally, we noticed that it's impossible to use
custom_metadata.h without creating a dependency cycle between
:custom_metadata and :grpc_base.
A full build refactoring is too large right now, so merge that header
into :grpc_base for the time being.
Also, separate `SimpleSliceBasedMetadata` into its own file, so that it
can be reused in custom_metadata.h.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This PR also centralizes the client channel resolver selection. Resolver
selection is still done using the plugin system, but when the Ares and
native client channel resolvers go away, we can consider bootstrapping
this differently.
Built atop #31448
Offers a simple framework for testing filters.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Add the capability for api-fuzzer to fuzz over different config
variables, to enable us to spot incompatible configurations there
sooner.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Protobuf's HEAD has changed the layout of bazel targets we use for
getting the list of source files for our python grpcio.tools build.
With this patch, make_grpcio_tools.py finishes successfully and gives
reasonable results.
Aim here is to allow adding custom metadata types to the internal build.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This reverts commit 7bd9267f32.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
(hopefully last try)
Add new channel arg GRPC_ARG_ABSOLUTE_MAX_METADATA_SIZE as hard limit
for metadata. Change GRPC_ARG_MAX_METADATA_SIZE to be a soft limit.
Behavior is as follows:
Hard limit
(1) if hard limit is explicitly set, this will be used.
(2) if hard limit is not explicitly set, maximum of default and soft
limit * 1.25 (if soft limit is set) will be used.
Soft limit
(1) if soft limit is explicitly set, this will be used.
(2) if soft limit is not explicitly set, maximum of default and hard
limit * 0.8 (if hard limit is set) will be used.
Requests between soft and hard limit will be rejected randomly, requests
above hard limit will be rejected.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
Earlier, we were simply using a 64 bit random number, but the spec
actually calls for UUIDv4.
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
This allows us to replace `absl::optional<TaskHandle>` with checks
against the invalid handle.
This PR also replaces the differently-named invalid handle instances
with a uniform way of accessing static invalid instances across all
handle types, which aids a bit in testing.
Add an event manager that spawns threads just as much as it possibly
can... to expose TSAN to the myriad thread ordering problems in our code
base.
Next steps for this will be to add a new test mode for tsan + thready
event engine + a few other doodads to increase threads in the system
(party.cc in particular has a good place for a hook).
<!--
If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.
If your pull request is for a specific language, please add the
appropriate
lang label.
-->
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>