Chiebot-Mirror/grpc - grpc - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Mark D. Roth	8a000f45f8	[grpclb and fake resolver] clean up e2e tests and simplify fake resolver (#34887 ) Changes to fake resolver: - Add `WaitForReresolutionRequest()` method to fake resolver response generator to allow tests to tell when re-resolution has been requested. - Change fake resolver response generator API to have only one mechanism for injecting results, regardless of whether the result is an error or whether it's triggered by a re-resolution. Changes to grpclb_end2end_test: - Change balancer interface such that instead of setting a list of responses with fixed delays, the test can control exactly when each response is set. - Change balancer impl to always send the initial LB response, as expected by the grpclb protocol. - Change balancer impl to always read load reports, even if load reporting is not expected to be enabled. (The latter case will still cause the test to fail.) Reads are done in a different thread than writes. - Allow each test to directly control how many backends and balancers are started and the client load reporting interval, so that (a) we don't waste resources starting servers we don't need and (b) there is no need to arbitrarily split tests across different test classes. - Add timeouts to `WaitForAllBackends()` functionality, so that tests will fail with a useful error rather than timing out. - Improved ergonomics of various helper functions in the test framework. In the process of making these changes, I found a couple of bugs: - A bug in pick_first, which I fixed in #34885. - A bug in grpclb, in which we were using the wrong condition to decide whether to propagate a re-resolution request from the child policy, which I've fixed in this PR. (This bug probably originated way back in #18344.) This should address a lot of the flakes seen in grpclb_e2e_test recently.	1 year ago
Mark D. Roth	30a24a37d1	[e2e tests] refactor code for determining local IP address (#34769 )	1 year ago
Vignesh Babu	299b4fe3fd	[flakiness] Fix regex comparison which causes client_lb test to flake at a high rate internally when event engine is enabled. (#34607 )	1 year ago
Mark D. Roth	883ec58237	[client_lb_e2e_test] fix flake in RR HealthChecking test (#34572 )	1 year ago
Mark D. Roth	7a06614f95	[resolver and LB policy APIs] reland: change address list to support multiple addresses per endpoint (#34531 ) Re-land #33567, which was reverted in #34527. First commit is a pure revert, second commit is a small fix needed to avoid breaking internal callers.	1 year ago
Mark D. Roth	41f26de3b6	Revert "[resolver and LB policy APIs] change address list to support multiple addresses per endpoint" (#34527 ) Reverts grpc/grpc#33567 due to import problems.	1 year ago
Mark D. Roth	fd2e8c9462	[resolver and LB policy APIs] change address list to support multiple addresses per endpoint (#33567 ) More changes as part of the dualstack design: - Change resolver and LB policy APIs to support multiple addresses per endpoint. Specifically, replace `ServerAddress` with `EndpointAddresses`, which encodes more than one address. Per-address channel args are retained at the same level, so they are now per-endpoint. For now, `EndpointAddresses` provides a single-address ctor and a single-address accessor for backward compatibility, so `ServerAddress` is an alias for `EndpointAddresses`; eventually, this alias and the single-address methods will be removed. - Add an `EndpointAddressSet` class, which represents an unordered set of addresses to be used as a map key. This will be used in a number of LB policies that need to store per-endpoint state. - Change the LB policy API's `ChannelControlHelper::CreateSubchannel()` method to take the address and per-endpoint channel args as separate parameters, so that we don't need to construct a legacy `ServerAddress` object as we create a new subchannel for each address in the endpoint. - Change pick_first to flatten the address list. - Change ring_hash to use `EndpointAddressSet` as the key for its endpoint map, and to use the first address of the endpoint as the hash key. - Change WRR to use `EndpointAddressSet` as the key for its endpoint weight map. Note that support for multiple addresses per endpoint is guarded in RR by the existing `round_robin_delegate_to_pick_first` experiment and in WRR by the existing `wrr_delegate_to_pick_first` experiment. This PR does not include support for multiple addresses per endpoint for the outlier_detection or xds_override_host LB policies; those will come in subsequent PRs.	1 year ago
Mark D. Roth	835775e347	[pick_first] implement Happy Eyeballs (#34426 )	1 year ago
Craig Tiller	0375a585e2	[work-serializer] Fix synchronous test assumption (#34392 ) This test assumed synchronous work serializer execution (or at least faster async than we always get)... make a trivial change to keep the test semantics but allow for the implementation to be more async.	1 year ago
Craig Tiller	86b931c354	[work-serializer] Dispatch on run experiment (relanding) (#34372 ) Reverts grpc/grpc#34371	1 year ago
Craig Tiller	d589caa679	Revert "[work-serializer] Dispatch on run experiment" (#34371 ) Reverts grpc/grpc#34274 (needs some changes internally)	1 year ago
Craig Tiller	1705470950	[work-serializer] Dispatch on run experiment (#34274 ) Co-authored-by: ctiller <ctiller@users.noreply.github.com> Co-authored-by: Mark D. Roth <roth@google.com>	1 year ago
Mark D. Roth	1986007e1e	[round_robin] 4th attempt: delegate to pick_first as per dualstack design (#34337 ) Most recent attempt was #34320, reverted in #34335. The first commit here is a pure revert. The second commit fixes the outlier_detection unit test to pass both with and without the experiment.	1 year ago
Mark D. Roth	6534f0a6bf	Revert "[round_robin] third attempt: delegate to pick_first as per dualstack design" (#34335 ) Reverts grpc/grpc#34320	1 year ago
Mark D. Roth	d713427cec	[round_robin] third attempt: delegate to pick_first as per dualstack design (#34320 ) Previous attempt was #34241, reverted in #34317. The second commit here makes the experiment disablable, so that we can roll it out slowly internally.	1 year ago
Craig Tiller	e6bf7c12cf	Revert "[round_robin] delegate to pick_first as per dualstack design" (#34317 ) Reverts grpc/grpc#34241	1 year ago
Mark D. Roth	97571ebf81	[round_robin] delegate to pick_first as per dualstack design (#34241 ) Rolls forward the remaining changes from #32692, which were rolled back in #33718.	1 year ago
Mark D. Roth	b7e680ad46	[health checking] move to generic health watch for dualstack design (#34222 ) Rolls forward part of the dualstack changes, mostly from #33427 and a little bit from #32692, both of which were reverted in #33718. Specifically: - For petiole policies, unconditionally start health watch on subchannels, even if client side health checking is not enabled; in this case, the health watch will report the subchannel's raw connectivity state. - Fix edge cases in health check reporting that occur when a watcher is started before the initial state is reported. - When client-side health checking fails, add the subchannel's address to the RPC failure status message. - Outlier detection now works only via the health checking watch, not via the raw connectivity state watch. - Remove now-unnecessary hack to ensure that outlier detection does not work for pick_first.	1 year ago
Mark D. Roth	b980f62ca6	[pick_first] adjust threshold on e2e test to address flake (#34157 )	1 year ago
Mark D. Roth	72e791402f	[pick_first] fix test flake (#34098 ) I could not reproduce the flake, but I've changed the test (which is very old) to use some of our more modern helper functions that have saner timeouts. Also re-add a `return` statement that was accidentally removed in #33753, which I noticed while working on this. Its absence doesn't cause a real problem, but it does cause us to needlessly trigger a duplicate connection attempt or report a duplicate CONNECTING update in some cases.	1 year ago
Mark D. Roth	64a318acd4	[pick_first] fix sticky-TF and handling of subchannels in TRANSIENT_FAILURE (#33753 ) Fix sticky-TF behavior such that once we enter TRANSIENT_FAILURE, we do not leave that state if we get a new address list. Also, fix handling of subchannels in state TRANSIENT_FAILURE. Previously, if a subchannel was already in state TRANSIENT_FAILURE when we wanted to start a connection attempt on it (e.g., because the subchannel already existed from a different channel, or because it already existed in the previous subchannel list), we would wait for it to report IDLE before attempting to connect. This PR changes pick_first to instead immediately skip the subchannel and move on to the next one. Now, the only time we wait for a subchannel in TRANSIENT_FAILURE is when we wrap back around to the first subchannel in the list.	1 year ago
Craig Tiller	91e7f223d3	[server] Remove `Notification` from shutdown path (#33953 ) I'm fairly certain that this path should be non-blocking (and making it so makes the promise based code far more tractable). This moves the blocking behavior into the blocking server_cc.cc function that calls `grpc_server_shutdown_and_notify` instead of in that non-blocking function. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Mark D. Roth	083bbee480	[LB policies] revert changes for dualstack design (#33718 ) This reverts the following PRs: #32692 #33087 #33093 #33427 #33568 These changes seem to have introduced some flaky crashes. Reverting while I investigate.	1 year ago
Mark D. Roth	ec39600872	[WRR] fix bugs that caused us to re-enter blackout period upon updates (#33694 ) As per gRFC A58, when WRR sees a subchannel report READY, it resets the non_empty_since value, thus restarting the blackout period. However, there were two cases where we were incorrectly triggering this code: 1. When WRR got an updated address list that contained addresses that were already present on the old list and whose subchannels were already in READY state, the initial notification for those subchannels on the new list was READY, which incorrectly triggered resetting the non_empty_since value. 2. Due to a bug in the outlier_detection policy, whenever an update was propagated down through the OD policy without actually enabling OD, it would incorrectly send a duplicate connectivity state notification for the subchannels. This meant that a subchannel that was already in state READY would report READY again, which would also incorrectly trigger resetting the non_empty_since value. This PR makes two changes: 1. Fix the bug in outlier_detection that caused it to generate the spurious duplicate READY updates. 2. Fix WRR to reset the non_empty_since value when a subchannel goes READY only if the subchannel has seen a previous state update and only if that previous state was not READY. (The duplicate READY notifications should not actually happen anymore now that the OD policy has been fixed, but better to be defensive.) Fixes b/290983884.	1 year ago
Mark D. Roth	51e54ed636	[outlier detection] remove support for ejection via raw connectivity state (#33427 ) More work on the dualstack backend design: - Now that all petiole policies have been changed to delegate to pick_first, outlier detection no longer needs to eject via the subchannel's raw connectivity state; it can now eject only via the health state. See #33340. - This also removes the now-unnecessary hack to explicitly disable outlier detection in pick_first. See #33336.	1 year ago
Mark D. Roth	27a778fece	[round robin] delegate to pick_first instead of creating subchannels directly (#32692 ) More work on the dualstack backend design: - Change round_robin to delegate to pick_first instead of creating subchannels directly. - Change pick_first such that when it is the child of a petiole policy, it will unconditionally start a health watch. - Change the client-side health checking code such that if client-side health checking is not enabled, it will return the subchannel's raw connectivity state. - As part of this, we introduce a new endpoint_list library to be used by petiole policies, which is intended to replace the existing subchannel_list library. The only policy that will still directly interact with subchannels is pick_first, so the relevant parts of the subchannel_list functionality have been copied directly into that policy. The subchannel_list library will be removed after all petiole policies are updated to delegate to pick_first.	1 year ago
Mark D. Roth	8427bacaea	[resolver API] remove address attribute interface (#33514 ) The address attribute interface was intended to provide a mechanism to pass attributes separately from channel args, for values that do not affect subchannel behavior and therefore do not need to be present in the subchannel key, which does include channel args. However, the mechanism as currently designed is fairly clunky and is probably not the direction we will want to go in the long term. Eventually, we will want some mechanism for registering channel args, which would provide a cleaner way to indicate that a given channel arg should not be used in the subchannel key, so that we don't need a completely different mechanism. For now, this PR is just doing an interim step, which is to establish a special channel arg key prefix to indicate that an arg is not needed in the subchannel key.	1 year ago
Yousuk Seung	c03cd744b2	[WRR] Prefer application_utilization to cpu_utilization (#33355 )	2 years ago
Mark D. Roth	6b4a1e4243	[outlier detection] hack to prevent OD from working with pick_first (#33336 ) As per discussion in #32967.	2 years ago
Mark D. Roth	a78001a087	[resolver] remove unused ctor for ServerAddress (#33148 ) Co-authored-by: markdroth <markdroth@users.noreply.github.com>	2 years ago
Mark D. Roth	1fcaccdf5f	[client channel] Second attempt: use ChunkedVector for call attributes (#33015 ) Original was #33002, reverted in #33014. The second commit here adds a build visibility tag necessary to fix the internal build problems.	2 years ago
AJ Heller	18aab6ffb5	Revert "[client channel] use ChunkedVector for call attributes" (#33014 ) Reverts grpc/grpc#33002. Breaks internal builds: `.../privacy_context:filters does not depend on a module exporting '.../src/core/lib/channel/context.h'`	2 years ago
Mark D. Roth	2f89fd5528	[client channel] use ChunkedVector for call attributes (#33002 ) Change call attributes to be stored in a `ChunkedVector` instead of `std::map<>`, so that the storage can be allocated on the arena. This means that we're now doing a linear search instead of a map lookup, but the total number of attributes is expected to be low enough that that should be okay. Also, we now hide the actual data structure inside of the `ServiceConfigCallData` object, which required some changes to the `ConfigSelector` API. Previously, the `ConfigSelector` would return a `CallConfig` struct, and the client channel would then use the data in that struct to populate the `ServiceConfigCallData`. This PR changes that such that the client channel creates the `ServiceConfigCallData` before invoking the `ConfigSelector`, and it passes the `ServiceConfigCallData` into the `ConfigSelector` so that the `ConfigSelector` can populate it directly.	2 years ago
Yousuk Seung	8b02295e58	[xDS] Accept cpu_utilization over 100% (#32954 )	2 years ago
Mark D. Roth	020e9b4dd6	[WRR] Remove env var guard for WRR policy (#32936 ) - remove the `_experimental` suffix from the gRPC policy name - remove the env var guard for the xDS policy config	2 years ago
Yousuk Seung	c02b3e695c	xDS: Include orca named_metrics in LRS load reports (#32690 )	2 years ago
Craig Tiller	175ccc3a90	Reland global config changes (#32661 ) --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	2 years ago
Yousuk Seung	16c03db9ac	Revert "Revert "WRR: Support EPS" (#32723 )" (#32725 ) This reverts commit `7bd9267f32`.	2 years ago
Esun Kim	7bd9267f32	Revert "WRR: Support EPS" (#32723 ) Reverts grpc/grpc#32657	2 years ago
Yousuk Seung	4429066516	WRR: Support EPS (#32657 )	2 years ago
Jan Tattermusch	0c1797cd9f	Revert "[config] Move global config alongside core configuration" (#32659 ) Reverts grpc/grpc#30788 (it breaks grpc_objc_bazel_test (see https://github.com/grpc/grpc/pull/30788#issuecomment-1476372187) and also seems to be breaking some other internal stuff).	2 years ago
Craig Tiller	b7a83305e6	[config] Move global config alongside core configuration (#30788 ) This is a big rewrite of global config. It does a few things, all somewhat intertwined: 1. centralize the list of configuration we have to a yaml file that can be parsed, and code generated from it 2. add an initialization and a reset stage so that config vars can be centrally accessed very quickly without the need for caching them 3. makes the syntax more C++ like (less macros!) 4. (optionally) adds absl flags to the OSS build This first round of changes is intended to keep the system where it is without major changes. We pick up absl flags to match internal code and remove one point of deviation - but importantly continue to read from the environment variables. In doing so we don't force absl flags on our customers - it's possible to configure grpc without the flags - but instead allow users that do use absl flags to configure grpc using that mechanism. Importantly this lets internal customers configure grpc the same everywhere. Future changes along this path will be two-fold: 1. Move documentation generation into the code generation step, so that within the source of truth yaml file we can find all documentation and data about a configuration knob - eliminating the chance of forgetting to document something in all the right places. 2. Provide fuzzing over configurations. Currently most config variables get stashed in static constants across the codebase. To fuzz over these we'd need a way to reset those cached values between fuzzing rounds, something that is terrifically difficult right now, but with these changes should simply be a reset on `ConfigVars`. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	2 years ago
Yousuk Seung	3cc76171a9	Merge per-request and per-server named metrics field-wise (#32634 ) We currently take named metrics recorded per-request only. Instead we should merge field-wise.	2 years ago
Yijie Ma	2ce147a131	Add a test case to verify SubchannelStreamClient retry when Health.Watch ends (#31850 )	2 years ago
Mark D. Roth	9dd6a98ed6	e2e tests: update regex used for connection failure messages (#32325 ) * e2e tests: update regex used for connection failure messages * attempt to tweak regex syntax to work on mac	2 years ago
Yousuk Seung	b98f527260	Revert "Revert "Revert "Revert "server: introduce ServerMetricRecorde… (#32301 ) * Revert "Revert "Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106)" (#32272)" (#32279)" (#32293)" This reverts commit `1f960697c5`. * Do not create CallMetricRecorder if call is null.	2 years ago
Craig Tiller	1f960697c5	Revert "Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106 )" (#32272 )" (#32279 )" (#32293 ) This reverts commit `4475e74c6a`.	2 years ago
Yousuk Seung	4475e74c6a	Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106 )" (#32272 )" (#32279 ) * Revert "Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106)" (#32272)" This reverts commit `deb1e25543`. * Fix by caching call metric recording stuff in async request PR #32106 caused msan errors in some tests while de-referencing the server object where async calls are active after the server is destroyed. Instead cache the ServerMetricRecorder pointer. * copyright headers fixed * clang fixes.	2 years ago
Xuan Wang	deb1e25543	Revert "server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106 )" (#32272 ) This reverts commit `c7f641da0d`.	2 years ago
Yousuk Seung	c7f641da0d	server: introduce ServerMetricRecorder API and move per-call reporting from a C++ interceptor to a C-core filter (#32106 ) * backend metric sampling * Comments addressed. * More comments addressed. * Pushing changes left behind locally. * Removed empty lines * Update OrcaService to use ServerMetricRecorder (no named metrics yet) * Comments addressed. * More comments addressed * More comments addressed. * Comments fixed * Comments addressed. * Test fixed * make seq returned always up-to-date * skip atomic load when not cached * Fixed ABSL_GUARDED_BY * Comments addressed except client_lb_end2end_test * test updated * Comments addressed * BUILD fix. * BackendMetricDataState moved to a separate header * comments addressed * Fixed clang and buildifier errors * More sanity check errors fixed. * Fixed xds tests * Ran generate_projects.sh * Comments addressed * comments addressed. * generate project * Build fixed * generate project * sanity check errors fixed * test fixed * Backup poller period override moved to main() * Also move cfstream override * Clang fixes, sanitize * generate_projects.sh * portable print format fix * Removed outdated comment	2 years ago
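
The second WRR fix described in commit ec39600872 above (restart the blackout period on READY only if the subchannel has already reported a state and that state was not READY) can be illustrated with a minimal sketch. All names here (`ConnectivityState`, `EndpointBlackoutTracker`, `OnStateUpdate`, `blackout_restarts`) are hypothetical and chosen for illustration; this is not the gRPC implementation, and the counter merely stands in for resetting the `non_empty_since` value.

```cpp
#include <optional>

// Hypothetical types for illustration only; not taken from the gRPC codebase.
enum class ConnectivityState { kIdle, kConnecting, kReady, kTransientFailure };

class EndpointBlackoutTracker {
 public:
  // Records a connectivity state update. Returns true if this update should
  // restart the blackout period (standing in for resetting non_empty_since).
  bool OnStateUpdate(ConnectivityState new_state) {
    // Restart only on a transition *into* READY: there must be a previously
    // seen state, and that state must not already have been READY.
    const bool restart = new_state == ConnectivityState::kReady &&
                         last_state_.has_value() &&
                         *last_state_ != ConnectivityState::kReady;
    last_state_ = new_state;
    if (restart) ++blackout_restarts_;
    return restart;
  }

  int blackout_restarts() const { return blackout_restarts_; }

 private:
  // Empty until the first state update has been seen.
  std::optional<ConnectivityState> last_state_;
  int blackout_restarts_ = 0;
};
```

Under this sketch, an initial READY notification (case 1 in the commit message) and a duplicate READY notification (case 2) both leave the blackout period untouched, while a READY that follows TRANSIENT_FAILURE restarts it.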


288 Commits (90af0a115d64c95bb9e70c1296c4f308d9cbca52)