* Add templates for PSM test.
This commit adds PSM-related templates generated from
loadtest_template.py. It also updates loadtest_example.sh
to generate examples based on the templates.
* WIP: add OOB backend metric API for LB policies
* fix some includes
* minor fixes
* picking this up again...
* more WIP
* health checking: cancel stream if response message fails to parse
* basic structure in place, but still have synchronization issues to address
* ORCA: implement ORCA RPC service for OOB backend metric reporting
* fix unused parameter error
* gen_upb_api
* add missing build deps
* increase test timing fudge factor
* add missing copyright header
* fix build and locking problems
* clang-format
* document API
* buildifier
* add test, but doesn't build yet
* new test working, but broke existing test, and need to fix server API
* don't register as a generic service
* update test for new orca service registration API
* fix build
* sanitize
* report interval defaults to min interval
* add channel trace event on UNIMPLEMENTED
* don't regenerate the response proto unless something changed
* add missing build dep
* fix comment
This commit adds the PSM base scenario to each language in
scenario_config.py, allowing per-language generation of PSM
base scenarios. The PSM base scenarios are further updated
in loadtest_config.py to set the number of client channels and the
number of async_server_thread. After these two fields are
updated, scenarios with varying offered_load are
generated from the base scenarios.
loadtest_config.py is updated to take the number of client channels,
the number of async_servers, and a list of targeted offered_load values.
Fixes b/229679479, a race between the timer starting and the channel
stack finishing initialization. The timer start is now delayed until
after the channel stack has been initialized, by moving the
increment/decrement into the already existing startup closure that runs
on an exec ctx.
* Initial unit testbench
* Tentative fix
* Reduce scope of fix. Clean up test
* Formatting
* Protect writing to RPC status store
* Don't start client until server is running
* Typo
* Remove redundant log
* More formatting
* Review comments
* Remove unmatched grpc_init from *ChannelFromFd methods
These unmatched grpc_init calls were preventing grpc_shutdown from
completing, which notably showed up in the only tests we have that
exercise these methods: `client_interceptors_end2end_tests.cc`.
These inits appear to be unnecessary, so I removed them. But if I've
missed some reason they're needed, an alternative is to find the right
place to add the corresponding `init_lib.shutdown();` call. I'm not sure
where that would be at this point.
See b/175634383
* Alternative: init & shutdown before calling any core library
* Add a raise for generate_projects if Python is too old
* Update the comment
* Resolve the YAPF difference between local 0.30 and kokoro's 0.30
* YAPF
Currently, when an embedded null is present, it is left as is. This causes an issue when grpc_sockaddr_to_uri is followed by any C-style string operation such as a copy, because the string is truncated at the unencoded null character. For example, this is triggered when channel args containing a string channel arg are copied.
To prevent this, properly encode the embedded null in the URI as %00.
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* remove unnecessary deps
* move aggregate and logical_dns cluster tests to their own file
* split aggregate and logical_dns tests into separate suites
* clang-format
* re-add flaky tag
* clang-tidy and remove unnecessary dep
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* move ring_hash tests to their own file
* generate_projects
* remove unnecessary deps
* re-add flaky tag
* clang-format
* Support unix socket in grpc_sockaddr_to_string
* make it return statusor
* clang fix
* make grpc_sockaddr_to_string() return statusor
* Let Chttp2ServerListener::Start crash
* test failure fixed
* api_fuzzer fixed
* comments addressed.
* more comments addressed
* comments addressed
* fix other broken builds
* refactor connection delay injection from client_lb_end2end_test
* fix build
* fix build on older compilers
* clang-format
* buildifier
* a bit of code cleanup
* start failover timer whenever the child reports CONNECTING, and don't cancel when deactivating
* clang-format
* rewrite test
* simplify logic in priority policy
* clang-format
* switch to using a bit to indicate child healthiness
* fix reversed comment
* more changes in priority and ring_hash.
priority:
- go back to starting failover timer upon CONNECTING, but only if seen
  READY or IDLE more recently than TRANSIENT_FAILURE
ring_hash:
- don't flap back and forth between IDLE and CONNECTING; once we go
  CONNECTING, we stay there until either TF or READY
- after the first subchannel goes TF, we proactively start another
  subchannel connecting, just like we do after a second subchannel
  reports TF, to ensure that we don't stay in CONNECTING indefinitely if
  we aren't getting any new picks
- always return ring hash's picker, regardless of connectivity state
- update the subchannel connectivity state seen by the picker upon
  subchannel list creation
- start proactive subchannel connection attempt upon subchannel list
  creation if needed
* ring_hash: fix connectivity state seen by aggregation and picker
* fix obiwan error
* swap the order of ring_hash aggregation rules 3 and 4
* restore original test
* refactor connection injector QueuedAttempt code
* add test showing that ring_hash will continue connecting without picks
* clang-format
* don't actually need seen_failure_since_ready_ anymore
* fix TSAN problem
* address code review comments
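The priority rule stated earlier in this change (start the failover timer upon CONNECTING, but only if READY or IDLE was seen more recently than TRANSIENT_FAILURE) can be sketched with a single bit of state. This illustrates the rule as worded in the message, not the final priority policy code; `ChildState`, `FailoverTimerGate`, and the initial value of the bit (a new child is treated as healthy) are all assumptions:

```cpp
// Hypothetical connectivity states reported by a child policy.
enum class ChildState { kIdle, kConnecting, kReady, kTransientFailure };

// Hypothetical helper that decides whether a state update should start the
// failover timer.
class FailoverTimerGate {
 public:
  // Records the new state and returns true if the timer should be started.
  bool OnChildState(ChildState state) {
    switch (state) {
      case ChildState::kReady:
      case ChildState::kIdle:
        healthy_ = true;  // READY or IDLE is now the most recent signal.
        return false;
      case ChildState::kTransientFailure:
        healthy_ = false;  // TRANSIENT_FAILURE is now the most recent signal.
        return false;
      case ChildState::kConnecting:
        // Only start the timer if the child was healthy more recently than
        // it failed.
        return healthy_;
    }
    return false;
  }

 private:
  // Assumption: a brand-new child counts as healthy, so its first
  // CONNECTING report starts the timer.
  bool healthy_ = true;
};
```

Under this sketch, a child that cycles TRANSIENT_FAILURE -> CONNECTING does not restart the timer, while READY -> CONNECTING does.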
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* move RLS tests to their own file
* remove unnecessary deps
* generate_projects
* Fixes a flake with the LoadReporter end2end test.
I *believe* the test is wrong, based on the .proto description of the
LoadReporter.
The protocol described in src/proto/grpc/lb/v1/load_reporter.proto has
the ReportLoad RPC return a stream of LoadReportResponse messages, each
of which has a repeated field of Load messages. The comment before it states:
"It is not strictly necessary to aggregate all entries into one entry
per <tag, user_id> tuple, although it is preferred to do so."
Debugging the issue shows we are in fact properly getting all 3 expected
load report types, just in two separate messages instead of a single
one.
This new test codepath coalesces the load report responses, and also
addresses the fact that the original test wasn't verifying that we were
getting the 3 expected types.
* Automated change: Fix sanity tests
* Renaming variables.
* ASSERT_ -> EXPECT_
* Automated change: Fix sanity tests
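The coalescing described in the LoadReporter flake fix above can be sketched as follows. The structs are plain stand-ins for the generated proto types, and `LoadKind` with its three placeholder values is hypothetical (the message does not name the 3 expected types); the point is only that the check runs over all streamed responses rather than over a single message:

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical placeholder for the 3 expected load report types.
enum class LoadKind { kTypeA, kTypeB, kTypeC };

// Simplified stand-ins for the Load and LoadReportResponse messages.
struct Load {
  LoadKind kind;
  std::int64_t count;
};
struct LoadReportResponse {
  std::vector<Load> load;  // repeated field of Load messages
};

// Flattens the Load entries of every streamed response into one list.
std::vector<Load> CoalesceLoads(
    const std::vector<LoadReportResponse>& responses) {
  std::vector<Load> all;
  for (const auto& response : responses) {
    all.insert(all.end(), response.load.begin(), response.load.end());
  }
  return all;
}

// Returns true if every expected kind appears at least once in `loads`.
bool SawEveryKind(const std::vector<Load>& loads,
                  const std::set<LoadKind>& expected) {
  std::set<LoadKind> seen;
  for (const auto& load : loads) seen.insert(load.kind);
  for (LoadKind kind : expected) {
    if (seen.count(kind) == 0) return false;
  }
  return true;
}
```

A test written this way passes whether the server sends the three entries in one response or splits them across two, which is the behavior the .proto comment permits.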
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* move fault injection tests to their own file
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* generate_projects
* appease clang-tidy
* move AdminServerThread to CSDS test suite
* remove unnecessary deps
* generate_projects
* don't mark test as flaky