This reverts commit 1624542ea4, relanding
https://github.com/grpc/grpc/pull/32956.
Because of some internal proto dependency and build problems, I've
removed the ServiceConfig proto fuzzing component. Once those build
issues are resolved, we can re-add the implementation deleted in commit
[b078c9c](b078c9c015)
in this PR.
Each OOB test runs for at least 2s (2 verifications, with the server
sending a report every 1s), and the current configuration allows at
most 3 clients to run in parallel; in reality, probably only two
clients run against one server. Currently 8 jobs run in each batch, and
each batch happens to have at most 2 OOB clients against one server.
Since 3 languages support ORCA, we should allow 3 clients running
against one server, so the timeout must be at least 4s. We add some
buffer so tests run more reliably: 10s should in theory support 5-6
clients running against one server, as the rough arithmetic below shows.
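A rough sketch of that arithmetic (the 1s report interval and 2
verifications come from the description above; the assumption that
clients sharing a server serialize is mine):
```
# Rough OOB timeout arithmetic. Assumption (mine): clients sharing a
# server effectively serialize, so each adds one verification window.
REPORT_INTERVAL_S = 1
VERIFICATIONS = 2
PER_CLIENT_S = REPORT_INTERVAL_S * VERIFICATIONS  # >= 2s per client

def max_clients(timeout_s):
    return timeout_s // PER_CLIENT_S

print(max_clients(10))  # -> 5, matching the "5-6 clients" estimate
```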
`tools/run_tests/sanity/check_absl_mutex.sh` was broken: a missing
paren crashed the script when run locally. It's unclear how our sanity
checks failed to complain about this; `run_tests.py` does not save the
log.
Rare bug: server initial metadata gets stranded in the outbound pipe.
(The fix is a little unpleasant, but we'll do better at the five-pipes
stage.)
re2 previously failed to compile if:
1. An old `re2` version was installed under a non-standard system
prefix, such as `/opt/local`.
2. The environment variable `CPPFLAGS=-I/opt/local/include` was set.
Running `make` would then produce function prototype mismatches,
because the Makefile would use the headers from
`/opt/local/include/re2` before those in the `third_party/re2/re2`
directory.
https://github.com/grpc/grpc/pull/27660 caused `CPPFLAGS` to be
inherited from the environment, but this can make the Makefile use
external include files for re2 and other libraries if `-I` flags are
defined. This commit reverts to the original behavior of only using
`RbConfig::CONFIG` values, to avoid using the wrong headers.
I noticed we add the cleanup hook after setting up the infrastructure,
so if infra setup fails, the cleanup never runs. This fixes that
ordering and adds extra checks so we don't call
`cls.test_client_runner` if it was never set; a minimal sketch of the
pattern follows.
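A minimal sketch of the safer ordering, assuming a `unittest`-style
test case (`setup_infra`/`cleanup_infra` are hypothetical stand-ins for
the framework's real helpers):
```
import unittest

class XdsTestCase(unittest.TestCase):
    test_client_runner = None

    @classmethod
    def setUpClass(cls):
        # Register cleanup BEFORE any setup work, so it still runs even
        # if setup raises partway through.
        cls.addClassCleanup(cls.cleanup_infra)
        cls.test_client_runner = cls.setup_infra()  # may raise

    @classmethod
    def setup_infra(cls):
        ...  # provision infrastructure, return a runner (hypothetical)

    @classmethod
    def cleanup_infra(cls):
        # Guard: setup may have failed before the runner was created.
        if cls.test_client_runner is not None:
            cls.test_client_runner.cleanup()
```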
The logger uses `absl::FPrintF` to write to stdout. From reading a
number of sources online, I got the impression that `std::fwrite`,
which `absl::FPrintF` uses, is atomic, so no locking is required here.
---------
Co-authored-by: rockspore <rockspore@users.noreply.github.com>
Fail the test if client or server pods restarted during the test.
#### Testing
Tested locally; the test fails with a message similar to:
```
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/google/home/xuanwn/workspace/xds/grpc/tools/run_tests/xds_k8s_test_driver/framework/xds_k8s_testcase.py", line 501, in tearDown
))
AssertionError: 5 != 0 : Server pods unexpectedly restarted {sever_restarts} times during test.
----------------------------------------------------------------------
Ran 1 test in 886.867s
```
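For illustration, a hedged sketch of what such a `tearDown` check can
look like (`get_pod_restarts` is a hypothetical helper, not the
driver's real API):
```
import unittest

class PodRestartCheckMixin(unittest.TestCase):
    server_runner = None  # set by the concrete test case

    def tearDown(self):
        super().tearDown()
        # Hypothetical helper; the real driver reads restart counts
        # from the k8s API.
        restarts = self.server_runner.get_pod_restarts()
        self.assertEqual(
            restarts, 0,
            msg=f"Server pods unexpectedly restarted {restarts} times"
                " during test.")
```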
This fixes a mistake made in https://github.com/grpc/grpc/pull/33030.
`sizeof()` counts the null terminator of a C string literal (e.g.
`sizeof("grpc_config=")` is 13, while the prefix is only 12
characters), so using it as the index into `result->substr()` skips a
byte. It would also crash if `result` contained only `grpc_config=`,
as @drfloob pointed out.
This metadata doesn't actually encode, so passing it through from an
app will force a crash.
Instead, just UTF-16 encode the null byte when dumping the value to
string form.
Better logging for `assertRpcStatusCodes` (I got tired of looking up
the status names).
#### Unexpected status found
Before:
```
AssertionError: AssertionError: Expected only status 15 but found status 0 for method UNARY_CALL:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
}
}
```
After:
```
AssertionError: Expected only status (15, DATA_LOSS), but found status (0, OK) for method UNARY_CALL:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
}
}
```
#### No traffic with expected status
Before:
```
AssertionError: 0 not greater than 0
```
After:
```
AssertionError: 0 not greater than 0 : Expected non-zero RPCs with status (15, DATA_LOSS) for method UNARY_CALL, got:
stats_per_method {
key: "UNARY_CALL"
value {
result {
key: 0
value: 251
}
result {
key: 15
value: 0
}
}
}
```
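For reference, a hedged sketch of how a numeric code can be resolved
to the `(code, NAME)` form shown above via `grpc.StatusCode` (the
driver's actual helper may differ):
```
import grpc

def format_status(code):
    # grpc.StatusCode members have .value == (int code, str description).
    name = next(
        (s.name for s in grpc.StatusCode if s.value[0] == code), "UNKNOWN")
    return f"({code}, {name})"

print(format_status(15))  # -> (15, DATA_LOSS)
print(format_status(0))   # -> (0, OK)
```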
#thistimeforsure
a863532c62 adds some debugging to help track which batches get leaked
by a transport.
3203e75ec5 makes connected_channel better respect the high-level
intent of cancellation (and fixes the last reason we needed to turn
these tests off).
aaf5fa036b re-enables testing of C++ e2e tests with server-based
promise calls.
This will change behavior for tests that have experiments enabled on
them: the flaky bit will always be on. In doing so, we'll still get
the usual failure reporting in the internal chat bot, but PRs can pass
even if an experiment isn't 100% passing yet, slightly reducing
friction for landing bigger experiments. A sketch of the idea follows.
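A hypothetical sketch of the idea; the real change lives in the
test-target generation, and these names are illustrative only:
```
def make_test_target(name, experiments=(), flaky=False):
    # Any test running under an experiment is forced flaky, so its
    # failures are reported but don't block PRs.
    return {
        "name": name,
        "env": {"GRPC_EXPERIMENTS": ",".join(experiments)},
        "flaky": flaky or bool(experiments),
    }

print(make_test_target("end2end_test", experiments=("my_experiment",)))
```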
Fixes `FakeXdsTransport` to remove itself from the map in
`FakeXdsTransportFactory` when it gets orphaned by the `XdsClient`, so
that a subsequent creation of a new transport for the same server does
not trigger an assertion due to the transport already existing in the
map.
Fixes internal b/259362837.
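A language-agnostic sketch of the shape of the fix (the real classes
are C++; names and structure here are simplified):
```
class FakeXdsTransport:
    def __init__(self, server, on_orphan):
        self.server = server
        self._on_orphan = on_orphan

    def orphan(self):
        # When the XdsClient drops the transport, remove it from the
        # factory's map so a later create() for the same server doesn't
        # trip the "already exists" assertion.
        self._on_orphan()

class FakeXdsTransportFactory:
    def __init__(self):
        self._transports = {}

    def create(self, server):
        assert server not in self._transports, "transport already exists"
        transport = FakeXdsTransport(
            server, on_orphan=lambda: self._transports.pop(server, None))
        self._transports[server] = transport
        return transport
```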
Was leading to a nullptr deref, and we just don't need this one anymore.
Before this change, `Found subchannel in state READY` and `Channel to
xds:///psm-grpc-server:61404 transitioned to state ` would dump the
full channel/subchannel; in implementations that expose
ChannelData.trace (e.g. Go), this added 300 extra lines of log.
Now we print brief repr-like channel/subchannel info:
```
Found subchannel in state READY: <Subchannel subchannel_id=9 target=10.110.1.44:8080 state=READY>
Channel to xds:///psm-grpc-server:61404 transitioned to state READY: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=READY>
```
Also, while waiting for the channel, we now log the channel_id too:
```
Waiting to report a READY channel to xds:///psm-grpc-server:61404
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=TRANSIENT_FAILURE>
Server channel: <Channel channel_id=2 target=xds:///psm-grpc-server:61404 state=READY>
```
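A hedged sketch of the brief repr-style formatting, assuming
channelz-style `Channel` messages (the driver's real helper may
resolve names differently):
```
from grpc_channelz.v1 import channelz_pb2

def channel_repr(channel):
    # channelz Channel: ref.channel_id, data.target, data.state.state.
    state = channelz_pb2.ChannelConnectivityState.State.Name(
        channel.data.state.state)
    return (f"<Channel channel_id={channel.ref.channel_id}"
            f" target={channel.data.target} state={state}>")
```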
With these, it is actually possible to have typed client stubs where
the return type is correctly inferred.
This only covers the non-streaming calls, because there is
`RequestIterableType` for the streaming ones (but that's just `Any`
with extra steps and would require much more work).
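A minimal sketch of what inferred return types enable, assuming
generic multi-callable types along these lines (the real stubs' type
parameters may differ):
```
from typing import Generic, TypeVar

Req = TypeVar("Req")
Resp = TypeVar("Resp")

class UnaryUnaryMultiCallable(Generic[Req, Resp]):
    """Stand-in for a generated unary-unary stub method."""
    def __call__(self, request: Req) -> Resp: ...

# With a stub method typed as
# UnaryUnaryMultiCallable[HelloRequest, HelloReply], a type checker
# now infers:
#     reply = stub.SayHello(request)  # reply: HelloReply
```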
---------
Co-authored-by: Xuan Wang <xuanwn@google.com>
---------
Co-authored-by: Yash Tibrewal <yashkt@google.com>
Co-authored-by: Stanley Cheung <stanleycheung@google.com>
Co-authored-by: AJ Heller <hork@google.com>
Co-authored-by: Yijie Ma <yijiem.main@gmail.com>
Co-authored-by: apolcyn <apolcyn@google.com>
Co-authored-by: Jan Tattermusch <jtattermusch@google.com>
In collaboration with @Vignesh2208. This supersedes
https://github.com/grpc/grpc/pull/32622. The original description is
below.
----
The current endpoint semantics are as follows:
On endpoint shutdown, the socket is closed immediately, but any
pending registered read/write closures are not executed immediately.
The pending read/write closures only get executed, with an aborted
error, whenever the next grpc_iocp_work call runs.
However, grpc_iocp_work may run only during grpc_shutdown, and
grpc_shutdown may only get scheduled after the pending registered
read/write closures execute, creating a circular dependency.
This PR changes the shutdown semantics to match those used on posix,
i.e., on endpoint shutdown, the socket is closed immediately and any
pending registered read/write closures are executed immediately.
Additional care is taken to ensure that the socket is not immediately
deleted, because the pending I/O ops still need to be flushed later
during grpc_shutdown.
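A simplified, language-agnostic sketch of the new semantics (the real
implementation is the C++ Windows IOCP endpoint; this toy model only
illustrates the ordering):
```
class FakeWindowsEndpoint:
    """Toy model of the shutdown behavior described above."""

    def __init__(self, socket):
        self._socket = socket
        self._pending = []  # registered read/write closures

    def register(self, closure):
        self._pending.append(closure)

    def shutdown(self):
        # New semantics (matching posix): close the socket and run all
        # pending closures with an aborted error right away, instead of
        # leaving them for the next grpc_iocp_work poll.
        self._socket.close()
        pending, self._pending = self._pending, []
        for closure in pending:
            closure("ABORTED")
```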
---------
Co-authored-by: Vignesh Babu <vigneshbabu@google.com>
@sampajano This should fix b/265779666. If the CFEngine ends up taking
some time to land (with rollbacks, bugs, and whatnot), this can work
in the meantime.
Similar to what we already do in other test suites:
- Try cleaning up resources three times.
- If unsuccessful, don't fail the test; just log the error. The
cleanup script should be the one to deal with this (see the sketch
below).
ref b/282081851
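A minimal sketch of that policy (names are illustrative, not the
suite's real helpers):
```
import logging

def best_effort_cleanup(resources, attempts=3):
    for attempt in range(1, attempts + 1):
        try:
            for resource in resources:
                resource.delete()
            return
        except Exception:
            logging.exception(
                "Cleanup attempt %d/%d failed", attempt, attempts)
    # Deliberately don't raise: the periodic cleanup script deals with
    # any leftovers.
```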
Avoids long path name problems on Windows
Fixes https://github.com/grpc/grpc/issues/32481.
Please test this with the (excellent) repro case in
https://github.com/grpc/grpc/pull/33000, and consider merging _just_ the
test from that PR.
Per #32481, the issue was bisected to
https://github.com/grpc/grpc/pull/30101. What changed in that PR is
that the epoll1 engine is only checked for availability once per
process at iomgr initialization (which, as a side effect, initializes
the engine), but the engine was being shut down by `grpc_shutdown`
anyhow. With repeated cycles of grpc init & shutdown in the same
process, the second attempt to reinit and use gRPC finds the epoll1
engine in an invalid state.
Reverts grpc/grpc#32909
It's breaking an internal test
(//net/grpc/python:internal_tests/unit/_default_reflection_test);
reverting for now to investigate.