DNS requests were previously cancellable, but it was assumed
that the resolution callback would be called in all cases. Now, requests
provide a `bool Cancel(TaskHandle handle)` method that lets callers know
whether cancellation is successful, and if so, the callback will not be
run.
This is in accordance with EventEngine's cancellation semantics, and it
is a step towards migration from iomgr to EventEngine.
* sketch
* Eliminate post-init in channel stack builder
We've had a post init function on channel stack builder for a very long
time, an it serves to run some code after initialization completes.
We need the functionality for a few things, but the function passed in
is intimately tied to the filter in use - we never vary it between
multiple functions for the same filter... which means it makes more
sense to locate this functionality as part of the filter interface.
* fix
* Automated change: Fix sanity tests
* introduce-channel-stack
* introduce-channel-stack
* Automated change: Fix sanity tests
* fix
* fix
* fix
* fixes
* Fix
* fix
* fix
* Automated change: Fix sanity tests
* Automated change: Fix sanity tests
* review feedback
* fix
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
* Move unnecessary absl includes from metadata_batch.h
metadata_batch.h is a widely used header, and so any includes it takes
that are for its internal use can negatively impact our build times.
* missing file
* build
* review feedback
* Eliminate post-init in channel stack builder
We've had a post init function on channel stack builder for a very long
time, an it serves to run some code after initialization completes.
We need the functionality for a few things, but the function passed in
is intimately tied to the filter in use - we never vary it between
multiple functions for the same filter... which means it makes more
sense to locate this functionality as part of the filter interface.
* fix
* Automated change: Fix sanity tests
* fix
Co-authored-by: ctiller <ctiller@users.noreply.github.com>
When single closure is scheduled multiple times before it is run, it
runs less times than the number of times it is scheduled.
As a result, when server receives multiple RPC calls in a very short
time frame, `accept_stream_locked` is not correctly called for every
RPC call received.
We are working on internal server side stress test to make sure this
kind of error won't happen again.
This commit also uses atomic int for issuing new stream id. Without this
there is 10% probability of stream id collision when 128 parallel RPC
calls are initiated at the same time.
Currently the tcp connect is performed in chttp2_connector before the handshaking is triggered. For
use cases where the application wants to perform business logic before
the tcp connection, this is problematic. By moving the TCP connect into
its own handshaker and registering it by default at the beginning, this
allows applications to add a new handshaker at the beginning allowing
handshaker logic before a TCP connect.
This approach has the advantage of slightly simplifying the logic in
tcp_connect_handshaker and httpcli as tcp_connect/callback can be
removed.
As the TCP connect needs parameters like resolved_addr,
interested_parties, a new struct called connection args is created as a
member for Handshaker Args.
For server handshakers most of the arguments here are not directly
useful, other than the deadline.
This file gets included by resolver, which gets included by
core_configuration, which in turn sprays this inclusion everywhere (and
it's a costly file to parse)
* adding a min progress size argument to grpc_endpoint_read
* fix missing argument error
* adding a static_cast
* reverting changes in tcp_posix.cc
* add missing changes to CFStreamEndpointTests.mm
* subchannel: report IDLE upon existing connection failure and after backoff interval
* rename AttemptToConnect() to RequestConnection()
* clang-format
* fix unused parameter warning
* fix subchannel to handle either TF or SHUTDOWN from transport
* fix handling of ConnectedSubchannel failure
* pass status up in IDLE state to communicate keepalive info
* update comment
* split pick_first and round_robin tests into their own test suites
* improve log message
* add test
* clang-format
* appease clang-tidy
* fix test to do a poor man's graceful shutdown to avoid spurious RPC failures
* simplify round_robin logic and fix test flakes
* fix grpclb bug
* Modifying slice buffer add to merge two contiguous slices sharing the same refcount object
* sanity checks
* updating test
* fixing grpc_slice_buffer_add API misuse in proto utils
* fix sanity checks
* minor fix
* WIP: add OOB backend metric API for LB policies
* fix some includes
* minor fixes
* picking this up again...
* more WIP
* health checking: cancel stream if response message fails to parse
* basic structure in place, but still have synchronization issues to address
* ORCA: implement ORCA RPC service for OOB backend metric reporting
* fix unused parameter error
* gen_upb_api
* add missing build deps
* increase test timing fudge factor
* add missing copyright header
* fix build and locking problems
* clang-format
* document API
* buildifier
* add test, but doesn't build yet
* new test working, but broke existing test, and need to fix server API
* don't register as a generic service
* update test for new orca service registration API
* fix build
* sanitize
* report interval defaults to min interval
* add channel trace event on UNIMPLEMENTED
* don't regenerate the response proto unless something changed
* add missing build dep
* fix comment
Currently when an embedded null is present, it is left as is. This causes an issue when grpc_sockaddr_to_uri is followed by any c style operations like copy, as the string is truncated at the non-encoded null character. For example this is triggered when channel args containing a string channel arg is copied.
To prevent this properly encode the URI with %00.
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* remove unnecessary deps
* move aggregate and logical_dns cluster tests to their own file
* split aggregate and logical_dns tests into separate suites
* clang-format
* re-add flaky tag
* clang-tidy and remove unnecessary dep
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* move ring_hash tests to their own file
* generate_projects
* remove unnecessary deps
* re-add flaky tag
* clang-format
* Support unix socket in grpc_sockaddr_to_string
* make it return statusor
* clang fix
* made grpc_sockaddr_to_string() to return statusor
* Let Chttp2ServerListener::Start crash
* test failure fixed
* api_fuzzer fixed
* comments addressed.
* more comments addressed
* comments addressed
* fix other broken builds
* refactor connection delay injection from client_lb_end2end_test
* fix build
* fix build on older compilers
* clang-format
* buildifier
* a bit of code cleanup
* start failover time whenever the child reports CONNECTING, and don't cancel when deactivating
* clang-format
* rewrite test
* simplify logic in priority policy
* clang-format
* switch to using a bit to indicate child healthiness
* fix reversed comment
* more changes in priority and ring_hash.
priority:
- go back to starting failover timer upon CONNECTING, but only if seen
READY or IDLE more recently than TRANSIENT_FAILURE
ring_hash:
- don't flap back and forth between IDLE and CONNECTING; once we go
CONNECTING, we stay there until either TF or READY
- after the first subchannel goes TF, we proactively start another
subchannel connecting, just like we do after a second subchannel
reports TF, to ensure that we don't stay in CONNECTING indefinitely if
we aren't getting any new picks
- always return ring hash's picker, regardless of connectivity state
- update the subchannel connectivity state seen by the picker upon
subchannel list creation
- start proactive subchannel connection attempt upon subchannel list
creation if needed
* ring_hash: fix connectivity state seen by aggregation and picker
* fix obiwan error
* swap the order of ring_hash aggregation rules 3 and 4
* restore original test
* refactor connection injector QueuedAttempt code
* add test showing that ring_hash will continue connecting without picks
* clang-format
* don't actually need seen_failure_since_ready_ anymore
* fix TSAN problem
* address code review comments
* move some code around
* remove num_backends parameter from XdsEnd2endTest
* remove use_xds_enabled_server param from XdsEnd2endTest
* remove xds_resource_does_not_exist_timeout_ms param from XdsEnd2endTest
* remove client_load_reporting_interval_seconds param from XdsEnd2endTest
* start moving CreateAndStartBackends() into individual tests
* finish moving CreateAndStartBackends() into individual tests
* remove unused variable
* remove SetEdsResourceWithDelay
* fix test flake
* clang-tidy
* clang-format
* move test framework to its own library
* fix build
* clang-format
* fix windows build
* rename TestType to XdsTestType
* move BackendServiceImpl inside of BackendServerThread
* clang-format
* move AdminServerThread to CSDS test suite
* move RLS tests to their own file
* remove unnecessary deps
* generate_projects
* Fixes a flake with the LoadReporter end2end test.
I *believe* the test is wrong, based on the .proto description of the
LoadReporter.
The protocol described in src/proto/grpc/lb/v1/load_reporter.proto has
the ReportLoad rpc returns a stream of LoadReportResponse, which itself
has a repeated field of Load messages. The comment before it states:
"It is not strictly necessary to aggregate all entries into one entry
per <tag, user_id> tuple, although it is preferred to do so."
Debugging the issue shows we are in fact properly getting all 3 expected
load report types, just in two separate messages instead of a single
one.
This new test codepath will coalesce the load report responses, and also
addresses the fact the original test wasn't verifying that we were
getting the 3 expected types.
* Automated change: Fix sanity tests
* Renaming variables.
* ASSERT_ -> EXPECT_
* Automated change: Fix sanity tests