New source of truth: https://github.com/grpc/psm-interop.
This PR removes PSM Interop framework source code from `tools/run_tests/xds_k8s_test_driver`, and all references to it.
Closes #35466
PiperOrigin-RevId: 597636949
Two more OpenSSL tests were recently added to the portability test suite. The at-head tests use the same set, which caused an unintended surge in test time and led to timeouts. So I've changed the at-head tests to not run the OpenSSL tests.
Closes #35520
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35520 from veblush:at-head-diet d0fc79d7f9
PiperOrigin-RevId: 597634232
`distribtest.cpp_linux_x64_debian10_aarch64_cross_cmake_aarch64_cross` has recently been timing out about 50% of the time by hitting the 45-minute deadline, so let's bump it to 60 minutes. (The Windows timeout is bumped as well for consistency.)
Closes #35479
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35479 from veblush:long-cross-aarch64 8ad82d684c
PiperOrigin-RevId: 597007435
Make sure there are no unnecessary delays when there are multiple reports in the queue.
This change also adds a test for the custom LB policy.
Closes #35467
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35467 from eugeneo:tasks/orca-test-timeout-316026521 4aab50a118
PiperOrigin-RevId: 597007131
Continue supporting the current `grpc-testing` project, which I suppose is
used inside Google, but also allow configuring a different project to
upload results to.
The format "project_id.dataset_id.table_id" is common for BigQuery so it
seems idiomatic to do it in this way. Adding a separate command line
option would be more complicated because it would require changes all
the way down the chain (at least in the entry point for the test driver
and in the LoadTest controller).
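A minimal sketch of how a "project_id.dataset_id.table_id" value could be parsed while keeping `grpc-testing` as the fallback project. The names `TableRef` and `parse_table_spec` are hypothetical, and accepting a two-part "dataset_id.table_id" value is an assumption made for illustration, not necessarily what the driver does.

```python
from typing import NamedTuple

# Fallback project taken from the description above; how the real driver
# stores this default is an assumption.
DEFAULT_PROJECT = "grpc-testing"


class TableRef(NamedTuple):
    project_id: str
    dataset_id: str
    table_id: str


def parse_table_spec(spec: str, default_project: str = DEFAULT_PROJECT) -> TableRef:
    """Split 'project.dataset.table' (or 'dataset.table') into its parts."""
    parts = spec.split(".")
    if len(parts) == 3:
        ref = TableRef(*parts)
    elif len(parts) == 2:
        # Assumed shorthand: no project given, so use the default one.
        ref = TableRef(default_project, *parts)
    else:
        raise ValueError(
            f"Expected 'project_id.dataset_id.table_id', got {spec!r}"
        )
    if not all(ref):
        raise ValueError(f"Empty component in table spec {spec!r}")
    return ref


print(parse_table_spec("my-project.e2e_benchmarks.results"))
print(parse_table_spec("e2e_benchmarks.results"))
```

Keeping everything in one string means only the code that finally talks to BigQuery has to understand the project, which matches the reasoning above about not threading a new option through the whole chain.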
Closes #35384
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35384 from lepistone:choose_bigquery_project 2355fea28c
PiperOrigin-RevId: 595523944
This PR adds CSM Observability testing capability in the PSM Interop testing framework. This PR mostly changes the framework Python code.
This adds a flag `enable_csm_observability` to the client / server deployment yaml file such that, when enabled, we will create a GMP `PodMonitoring` resource and pass `--enable_csm_observability` to each language's client / server container (so that they actually enable the Prometheus endpoint).
I added a new test under `tests/csm/csm_observability_test.py`. This is basically a copy of `tests/baseline_test.py` but with `enable_csm_observability=True`.
Other PRs for this whole thing to work:
- https://github.com/grpc/grpc/pull/34752: The `PodMonitoring` resource yaml template
- https://github.com/grpc/grpc/pull/34832: Support for the `--enable_csm_observability` flag in the C++ client/server image
Closes #34835
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34835 from stanley-cheung:csm-o11y-framework-changes 0b3d0eb7ed
PiperOrigin-RevId: 595502496
- `memory_pressure_controller` finally - allows deletion of pid_controller throughout the codebase
- `overload_protection` - one of the http2 rapid reset mitigations
- `red_max_concurrent_streams` - another http2 rapid reset mitigation
Closes #35426
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35426 from ctiller:new-years-cleanse 4651672e7e
PiperOrigin-RevId: 595205029
Remove the old `switch` library - this used to be an implementation detail of `Seq`, `TrySeq` - but has become unused.
Add a new user facing primitive `Switch` that fills a similar role to `switch` in C++ - selecting a promise to execute based on a primitive discriminator - much like `If` allows selection based on a boolean discriminator now.
A future change will optimize this by lowering `Switch` into an actual `switch` statement, but for now I want to get the functionality in.
Closes #35424
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35424 from ctiller:switchy 5308a914c6
PiperOrigin-RevId: 595140965
This is a prerequisite change to start supporting Bazel 7. Changes:
- Disabled bzlmod, which Bazel 7 enables by default. bzlmod will need to be supported eventually, but not now.
- Upgraded some Bazel rule dependencies that are required to support Bazel 7.
- Use Python 3 explicitly, as Bazel 7 rejects Python 2.
Note that this isn't enough to enable Bazel 7 by default; another PR will follow for that.
Closes #35374
PiperOrigin-RevId: 592931675
Simple `assert` statements say little about what needs to be done. Explicit error messages instead tell us what went wrong and where to look.
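To illustrate the difference, here is a small sketch contrasting a bare `assert` with an explicit error. The names `CheckError` and `require_path` and the message text are hypothetical and are not taken from the script changed in this PR.

```python
import os


class CheckError(Exception):
    """Raised when a precondition fails; carries an actionable message."""


def require_path_with_assert(path: str) -> None:
    # A failure here only yields a bare AssertionError with no guidance.
    assert os.path.exists(path)


def require_path(path: str, hint: str) -> None:
    # A failure here names the missing path and says what to do next.
    if not os.path.exists(path):
        raise CheckError(f"Required path not found: {path!r}. {hint}")


try:
    require_path("does/not/exist", "Run the build step before this check.")
except CheckError as e:
    print(e)
```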
Closes #35375
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35375 from veblush:check-work 0733499c31
PiperOrigin-RevId: 592920747
Fix: https://github.com/grpc/grpc/issues/35085
Closes #35325
PiperOrigin-RevId: 592635611
Closes #35280
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35280 from yashykt:UpdateInteropScriptForFindingAdsChannel db213384b4
PiperOrigin-RevId: 591090750
`AllOk` runs a set of promises concurrently, and like `TryJoin` waits for them all to succeed or one to fail.
Unlike `TryJoin` it returns a single unified status of the composition, so cannot handle member promises that might return `StatusOr` or the like.
Closes #35304
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35304 from ctiller:all-review 30f5f809c6
PiperOrigin-RevId: 591031189
Enable OpenSSL 1.0.2 tests and add a container for 1.1.1 so that it is tested during portability testing as well.
Closes #35236
PiperOrigin-RevId: 590345568
### Changes in this PR
* Refactor and remove some Core/C++ dependencies to simplify the Python Observability package build process.
  * Refactored code to read config at the Python layer.
* Enable observability build from source.
* Add observability to run_test.
  * Currently it's only enabled on Linux.
* Add error handler in run_test loaders.
  * The current framework always visits modules in the test directory and then decides which tests to skip.
  * Since we're not building Observability for macOS and Windows, this step fails with the error `No module named 'grpc_observability'`.
  * After the change we'll just skip those modules.
  * We still have `_sanity_test` to make sure all tests are loaded correctly for each platform.
* Remove OC dependency as we're migrating to OTel.
  * Also removed trace from testing.
  * Note that the trace propagation function was also removed because of this.
### Testing
* Passed existing tests.
* Tested locally, able to build observability from source using `GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install .`.
Closes #34207
PiperOrigin-RevId: 590258014
Starting from Python 3.11, the pipes module produces this warning:
DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
It turns out that in this repo the pipes module is only used for the
"quote" function, which is in turn taken directly from the shlex module [1].
The shlex module is not deprecated as of today and is already used in
other places in this repo. The function shlex.quote has been around
since the ancient Python 3.3.
[1] https://github.com/python/cpython/blob/3.11/Lib/pipes.py#L64-L66
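As a sketch of what the replacement looks like at a call site, `shlex.quote` can be used wherever `pipes.quote` was. The helper name `build_shell_command` is hypothetical and only serves the example.

```python
import shlex


def build_shell_command(args):
    """Join arguments into one shell-safe command line."""
    # Previously: " ".join(pipes.quote(a) for a in args)
    return " ".join(shlex.quote(a) for a in args)


cmd = build_shell_command(["echo", "hello world", "it's"])
print(cmd)
# Splitting the quoted command recovers the original arguments.
print(shlex.split(cmd))
```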
Closes #34941
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34941 from lepistone:deprecate-python-pipes 233c54c135
PiperOrigin-RevId: 588883480
Closes #35153
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35153 from yijiem:native_dns_resolver a4107f7d81
PiperOrigin-RevId: 588543137
Removes noise from the cleanup/teardown ops.
#### GCP APIs
In GCP APIs, change the log level for delete operations that failed because the resource doesn't exist (API 404) from `info` to `debug`. The framework's logging philosophy is to only log external operations (e.g. APIs, RPCs). If no error is logged, the op is assumed successful.
In the deletion case, it is still possible to tell whether the op was actually performed by looking for the `Waiting %s sec for %s operation id: %s` log message.
#### K8s APIs
In K8s APIs:
- For delete operations that failed because the resource doesn't exist (API 404) the log level is changed from `info` to `debug`
- For delete operations that failed for any other reason, the log level is changed from `info` to `warning`
- When `wait_for_deletion` is enabled (it's the default) the delete operation will be confirmed with `logger.info("<resource_kind> %s deleted", name)`. Previously it logged at the `debug` level.
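The K8s log-level rules above can be sketched roughly as follows. `ApiError` and `delete_resource` are hypothetical stand-ins rather than the framework's or the Kubernetes client's real classes, the failure messages are invented for illustration, and the sketch assumes the deletion is confirmed synchronously.

```python
import logging

logger = logging.getLogger(__name__)


class ApiError(Exception):
    """Hypothetical stand-in for an API client error carrying an HTTP status."""

    def __init__(self, status: int, reason: str = ""):
        super().__init__(f"{status} {reason}".strip())
        self.status = status


def delete_resource(kind: str, name: str, delete_fn) -> bool:
    """Call delete_fn(name) and log the outcome at the level described above."""
    try:
        delete_fn(name)
    except ApiError as e:
        if e.status == 404:
            # The resource doesn't exist: expected during cleanup, so debug only.
            logger.debug("%s %s not deleted since it doesn't exist", kind, name)
        else:
            # Any other failure deserves attention.
            logger.warning("Failed to delete %s %s: %s", kind, name, e)
        return False
    # Confirmed deletion is logged at info.
    logger.info("%s %s deleted", kind, name)
    return True
```

With this split, a normal teardown shows only the confirmed deletions at `info`, while missing resources stay out of the way unless debug logging is enabled.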
Closes #35131
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/35131 from sergiitk:psm-interop-debug-log-on-delete-404 f6629e5132
PiperOrigin-RevId: 587851692
We don't need this check anymore.
Delete the check from the yaml and the sh file.
Closes #35161
PiperOrigin-RevId: 587784923
Add a variant of `Spawn` that returns a promise that can be awaited by another activity.
This allows us to simply implement complex cross-activity synchronization.
(necessary building block for #34740)
Also adds an inter-activity latch as a building block to test this work.
Closes #34744
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/34744 from ctiller:ninteen-ninety-nine 19074b255f
PiperOrigin-RevId: 582450643
`StatusFlag` acts like a status, but is just a boolean (we don't want to
accidentally treat a boolean as something that indicates failure in case
it's not)
Similarly `ValueOrFailure` looks like `StatusOr` but reduces the failure
space to one value.
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
We're seeing timeout errors in our distribution test:
https://fusion2.corp.google.com/invocations/dfa9aaa9-e94b-479e-8c28-a39d98d277bc/targets/github%2Fgrpc%2Fbuild_artifacts_python;config=default/tests.
Sample error:
`2023-11-10 09:12:19,512 TIMEOUT:
build_artifact.python_windows_x86_Python39_32bit [pid=2320,
time=2700.1sec]`
This change increases timeout for windows build artifact jobs to 7200s,
which aligns with all other jobs (except `linux_extra`, which is 3600s).
This commit upgrades gRPC to protobuf v25.0 and makes some fixes to
account for upb changes. One major change is that upb has been merged
into the protobuf repo, so we can now drop the separate `@upb`
dependency. Another is that `.upb.c` files no longer exist and there are
new `.upb_minitable.h` and `.upb_minitable.c` files. The longer
filenames exceeded a Windows restriction, so to work around that I
renamed the `upb-generated` directory to just `upb-gen`, and likewise
for `upbdefs-generated`.
Modeled after mutexes in the Rust ecosystem: the mutex owns the data
provided, and acquisition of the mutex returns a handle with which to
manipulate that data.
This fits in nicely with the execution environment we've established
whereby we may want to pass the lock from lambda to lambda for some
time.
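As a rough analogy only (the actual primitive is C++, and every name below is hypothetical), the "mutex owns the data, acquisition returns a handle" idea can be sketched in Python like this:

```python
import threading
from contextlib import contextmanager
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")


class Guard(Generic[T]):
    """Handle through which the protected value is read or replaced."""

    def __init__(self, owner: "OwnedMutex[T]"):
        self._owner = owner
        self._valid = True

    def _check(self) -> None:
        if not self._valid:
            raise RuntimeError("guard used after the lock was released")

    def get(self) -> T:
        self._check()
        return self._owner._value

    def set(self, value: T) -> None:
        self._check()
        self._owner._value = value


class OwnedMutex(Generic[T]):
    """A mutex that owns its data; the data is reachable only via a Guard."""

    def __init__(self, value: T):
        self._lock = threading.Lock()
        self._value = value

    @contextmanager
    def acquire(self) -> Iterator[Guard[T]]:
        with self._lock:
            guard = Guard(self)
            try:
                yield guard
            finally:
                # Invalidate the handle so it cannot outlive the lock.
                guard._valid = False


counter = OwnedMutex(0)
with counter.acquire() as g:
    # The guard can be handed to other callables while the lock is held.
    increment = lambda guard: guard.set(guard.get() + 1)
    increment(g)
```

Because the value is reachable only through the guard, there is no way to touch it without first acquiring the lock, which is the property borrowed from the Rust design.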
---------
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
This is a follow-up to #34191 that handles the error condition where
endpoints fail to read or write in the chaotic-good client transport.
This PR needs to be merged after #34191.
---------
Co-authored-by: Bradley Hess <bdhess@google.com>
Co-authored-by: AJ Heller <hork@google.com>
Currently we log two nearly identical messages:
```
client_app.py:320] [psm-grpc-client-7768f6597-nvtgl] Detected successful calls to xDS control plane: trafficdirector.googleapis.com:443
client_app.py:292] [psm-grpc-client-7768f6597-nvtgl] ADS: Detected successful calls to xDS control plane trafficdirector.googleapis.com:443
```
This PR will log the latest channel state in the first message, similar
to what we do in `find_server_channel_with_state`:
52c08f4498/tools/run_tests/xds_k8s_test_driver/framework/test_app/client_app.py (L367-L371)
After the change:
```
client_app.py:320] [psm-grpc-client-6566595cff-8wrfd] Detected successful calls to xDS control plane trafficdirector.googleapis.com:443, channel: <Channel channel_id=4 target=trafficdirector.googleapis.com:443 call_started=9 calls_failed=8 state=READY>
client_app.py:292] [psm-grpc-client-6566595cff-8wrfd] ADS: Detected successful calls to xDS control plane trafficdirector.googleapis.com:443
```
This adds the directory reloader implementation of the CrlProvider. This
will periodically reload CRL files in a directory per [gRFC
A69](https://github.com/grpc/proposal/pull/382)
Included in this is the following:
* A public API to create the `DirectoryReloaderCrlProvider`
* A basic directory interface in gprpp and platform-specific impls for
getting the list of files in a directory (unfortunately, prior to C++17
there is no std::filesystem, so we have to have platform-specific impls)
* The implementation of `DirectoryReloaderCrlProvider` takes an
event_engine and a directory interface. This allows us to test using the
fuzzing event engine for time mocking, and to implement a test directory
interface so we avoid having to make temporary directories and files in
the tests. This is notably not in `include`, and the
`CreateDirectoryReloaderCrlProvider` is the only way to construct one
from the public API, so we don't expose the event engine and directory
details to the user.
---------
Co-authored-by: gtcooke94 <gtcooke94@users.noreply.github.com>
This PR fixes a bug identified in #29667, where the TLS channel
credentials still require a trust bundle even if the user has explicitly
opted to not verify the server certificate. This PR is based on #29810.
Add a `PodMonitoring` resource type to the PSM interop testing
framework. This is needed so that GMP (Google Managed Prometheus) can
scrape the Prometheus endpoints of the matching GKE pods for metrics.
When a test fails, the cleanup script tries to delete some resources
we didn't create, resulting in lots of 404 errors. We should exclude
that status code, since we have specific handling for 404.