Chiebot-Mirror/grpc

Commit Graph

Author	SHA1	Message	Date
Craig Tiller	47306d78f4	[work-serializer] Add some basic process-wide monitoring (#34369 ) Add some basic metrics to work serializer, keep them process wide for now (though it may be interesting to get these into channelz in the future). Collected are: - time spent running a work serializer when it starts - time spent actually executing work when the work serializer runs - number of items executed each run A high disparity between the first two indicates our dispatching mechanism is adding large amounts of latency (perhaps due to thread-starvation-like effects). A high value for any of these indicates contention on the serializer. It's likely a future iteration on these will select different metrics - I'm not entirely sure which will be useful in production analysis yet. I'm using `std::chrono::steady_clock` here for precision (nanoseconds) with a compact representation (better than timespec) and a robust & portable api - I think it's appropriate for metrics, but wouldn't use it much beyond that at this point.	1 year ago
Mark D. Roth	25cb8e6ed2	[WRR] delegate to pick_first as per dualstack design (#34245 ) Rolls forward the changes from #33087, which were rolled back in #33718. This change is now guarded by a disablable experiment.	1 year ago
Craig Tiller	86b931c354	[work-serializer] Dispatch on run experiment (relanding) (#34372 ) Reverts grpc/grpc#34371	1 year ago
Craig Tiller	d589caa679	Revert "[work-serializer] Dispatch on run experiment" (#34371 ) Reverts grpc/grpc#34274 (needs some changes internally)	1 year ago
Craig Tiller	1705470950	[work-serializer] Dispatch on run experiment (#34274 ) Co-authored-by: ctiller <ctiller@users.noreply.github.com> Co-authored-by: Mark D. Roth <roth@google.com>	1 year ago
Mark D. Roth	1986007e1e	[round_robin] 4th attempt: delegate to pick_first as per dualstack design (#34337 ) Most recent attempt was #34320, reverted in #34335. The first commit here is a pure revert. The second commit fixes the outlier_detection unit test to pass both with and without the experiment.	1 year ago
Mark D. Roth	6534f0a6bf	Revert "[round_robin] third attempt: delegate to pick_first as per dualstack design" (#34335 ) Reverts grpc/grpc#34320	1 year ago
Mark D. Roth	d713427cec	[round_robin] third attempt: delegate to pick_first as per dualstack design (#34320 ) Previous attempt was #34241, reverted in #34317. The second commit here makes the experiment disablable, so that we can roll it out slowly internally.	1 year ago
Craig Tiller	e6bf7c12cf	Revert "[round_robin] delegate to pick_first as per dualstack design" (#34317 ) Reverts grpc/grpc#34241	1 year ago
Mark D. Roth	97571ebf81	[round_robin] delegate to pick_first as per dualstack design (#34241 ) Rolls forward the remaining changes from #32692, which were rolled back in #33718.	1 year ago
Eugene Ostroukhov	3824288bad	[Tests] Move the http_proxy_mapper_test.cc back (#34268 )	1 year ago
Eugene Ostroukhov	a5e9feeb04	[HTTP Proxy] Rename source/header and move test (#34221 )	1 year ago
Mark D. Roth	b7e680ad46	[health checking] move to generic health watch for dualstack design (#34222 ) Rolls forward part of the dualstack changes, mostly from #33427 and a little bit from #32692, both of which were reverted in #33718. Specifically: - For petiole policies, unconditionally start health watch on subchannels, even if client side health checking is not enabled; in this case, the health watch will report the subchannel's raw connectivity state. - Fix edge cases in health check reporting that occur when a watcher is started before the initial state is reported. - When client-side health checking fails, add the subchannel's address to the RPC failure status message. - Outlier detection now works only via the health checking watch, not via the raw connectivity state watch. - Remove now-unnecessary hack to ensure that outlier detection does not work for pick_first.	1 year ago
Mark D. Roth	b8fd38d7cb	[xds_override_host] improve logging for debuggability (#34223 ) I wound up needing this to debug some problems in the dualstack code.	1 year ago
Mark D. Roth	6412412ae1	[pick_first] changes to support dualstack design (#34218 ) This rolls forward only the pick_first changes from #32692, which were rolled back in #33718. Specifically: - Changes PF to use its own subchannel list implementation instead of using the subchannel_list library, since the latter will be going away with the dualstack changes. - As a result of no longer using the subchannel_list library, PF no longer needs to set the `GRPC_ARG_INHIBIT_HEALTH_CHECKING` channel arg. - Adds an option to start a health watch on the chosen subchannel, to be used in the future when pick_first is the child of a petiole policy. (Currently, this code is not actually called anywhere.)	1 year ago
Craig Tiller	b85b57fdc7	[wrr] Add metrics to help debug high WRR cost (#34095 ) WRR is showing a very high CPU cost relative to previous solutions, and it's unclear why this is. Add two metrics that should help us see the shape of the subchannel sets that are being passed to high cost systems in order to confirm/deny theories. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Mohan Li	ab024624da	[pick_first] de-experiment pick first (#34054 ) De-experiment pick first since we have both affinity and randomness E2E test running successfully. --------- Co-authored-by: Yash Tibrewal <yashkt@google.com>	1 year ago
Mark D. Roth	64a318acd4	[pick_first] fix sticky-TF and handling of subchannels in TRANSIENT_FAILURE (#33753 ) Fix sticky-TF behavior such that once we enter TRANSIENT_FAILURE, we do not leave that state if we get a new address list. Also, fix handling of subchannels in state TRANSIENT_FAILURE. Previously, if a subchannel was already in state TRANSIENT_FAILURE when we wanted to start a connection attempt on it (e.g., because the subchannel already existed from a different channel, or because it already existed in the previous subchannel list), we would wait for it to report IDLE before attempting to connect. This PR changes pick_first to instead immediately skip the subchannel and move on to the next one. Now, the only time we wait for a subchannel in TRANSIENT_FAILURE is when we wrap back around to the first subchannel in the list.	1 year ago
Craig Tiller	3717ff04ba	[chttp2] Split ping policy from transport (#33703 ) Why: Cleanup for chttp2_transport ahead of promise conversion - lots of logic has become interleaved throughout chttp2, so some effort to isolate logic out is warranted ahead of that conversion. What: Split configuration and policy tracking for each of ping rate throttling and abuse detection into their own modules. Add tests for them. Incidentally: Split channel args into their own header so that we can split the policy stuff into separate build targets. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Mark D. Roth	083bbee480	[LB policies] revert changes for dualstack design (#33718 ) This reverts the following PRs: #32692 #33087 #33093 #33427 #33568 These changes seem to have introduced some flaky crashes. Reverting while I investigate.	1 year ago
Mark D. Roth	ec39600872	[WRR] fix bugs that caused us to re-enter blackout period upon updates (#33694 ) As per gRFC A58, when WRR sees a subchannel report READY, it resets the non_empty_since value, thus restarting the blackout period. However, there were two cases where we were incorrectly triggering this code: 1. When WRR got an updated address list that contained addresses that were already present on the old list and whose subchannels were already in READY state, the initial notification for those subchannels on the new list was READY, which incorrectly triggered resetting the non_empty_since value. 2. Due to a bug in the outlier_detection policy, whenever an update was propagated down through the OD policy without actually enabling OD, it would incorrectly send a duplicate connectivity state notification for the subchannels. This meant that a subchannel that was already in state READY would report READY again, which would also incorrectly trigger resetting the non_empty_since value. This PR makes two changes: 1. Fix the bug in outlier_detection that caused it to generate the spurious duplicate READY updates. 2. Fix WRR to reset the non_empty_since value when a subchannel goes READY only if the subchannel has seen a previous state update and only if that previous state was not READY. (The duplicate READY notifications should not actually happen anymore now that the OD policy has been fixed, but better to be defensive.) Fixes b/290983884.	1 year ago
Mark D. Roth	38816cf327	[WRR] delegate to pick_first instead of creating subchannels directly (#33087 ) As part of the dualstack backend design, change WRR to delegate to pick_first instead of creating subchannels directly.	1 year ago
Mark D. Roth	27a778fece	[round robin] delegate to pick_first instead of creating subchannels directly (#32692 ) More work on the dualstack backend design: - Change round_robin to delegate to pick_first instead of creating subchannels directly. - Change pick_first such that when it is the child of a petiole policy, it will unconditionally start a health watch. - Change the client-side health checking code such that if client-side health checking is not enabled, it will return the subchannel's raw connectivity state. - As part of this, we introduce a new endpoint_list library to be used by petiole policies, which is intended to replace the existing subchannel_list library. The only policy that will still directly interact with subchannels is pick_first, so the relevant parts of the subchannel_list functionality have been copied directly into that policy. The subchannel_list library will be removed after all petiole policies are updated to delegate to pick_first.	1 year ago
Mark D. Roth	8427bacaea	[resolver API] remove address attribute interface (#33514 ) The address attribute interface was intended to provide a mechanism to pass attributes separately from channel args, for values that do not affect subchannel behavior and therefore do not need to be present in the subchannel key, which does include channel args. However, the mechanism as currently designed is fairly clunky and is probably not the direction we will want to go in the long term. Eventually, we will want some mechanism for registering channel args, which would provide a cleaner way to indicate that a given channel arg should not be used in the subchannel key, so that we don't need a completely different mechanism. For now, this PR is just doing an interim step, which is to establish a special channel arg key prefix to indicate that an arg is not needed in the subchannel key.	1 year ago
Eugene Ostroukhov	7bce35ed41	Revert "Revert "[lb pick_first] Enable random shuffling of address list" (#33497 ) Original: #33496 This reverts commit `d59c8eb0f5`.	1 year ago
Eugene Ostroukhov	d59c8eb0f5	Revert "[lb pick_first] Enable random shuffling of address list (#33254 )" (#33496 ) Original PR: 33254 This reverts commit `7e14a322a2`.	1 year ago
Eugene Ostroukhov	7e14a322a2	[lb pick_first] Enable random shuffling of address list (#33254 ) Implementation of [gRFC A62](https://github.com/grpc/proposal/blob/master/A62-pick-first.md)	1 year ago
Mark D. Roth	50e970246f	[LB policy API] add helper methods for getting channel creds (#33451 ) This addresses a long-standing TODO that we couldn't do prior to #25586. For details, see internal doc go/grpc-rls-callcreds-to-server.	2 years ago
Mark D. Roth	6a04e9c7e5	[subchannel interface] add method for cancelling data watches (#33359 ) This paves the way for cases where we handle health watches in wrapped subchannels, which we'll need as part of the dualstack backend design.	2 years ago
Yousuk Seung	c03cd744b2	[WRR] Prefer application_utilization to cpu_utilization (#33355 )	2 years ago
Mark D. Roth	6b4a1e4243	[outlier detection] hack to prevent OD from working with pick_first (#33336 ) As per discussion in #32967.	2 years ago
Mark D. Roth	77418492fd	[pick_first] add tests that show handling of multiple addresses (#33255 ) This should help you get an idea of how to write the tests for #33254.	2 years ago
Mark D. Roth	ba1b8b15ea	[JSON] fix bug that incorrectly allowed trailing commas after an empty container (#33158 )	2 years ago
Mark D. Roth	a78001a087	[resolver] remove unused ctor for ServerAddress (#33148 ) Co-authored-by: markdroth <markdroth@users.noreply.github.com>	2 years ago
Mark D. Roth	8fdfb22848	[JSON] generalize handling of RefCountedPtr<> (#33048 ) Also remove a check in the weighted_target LB policy that I somehow missed in #32932.	2 years ago
Mark D. Roth	2c423d277c	[outlier detection] fix crash with pick_first and add tests (#33069 ) Fixes #32967. Also fix incorrect defaults for `enforcementPercentage` fields.	2 years ago
Mark D. Roth	17315823c2	[client channel] assume LB policies start in CONNECTING state (#33009 ) Currently, we are not very consistent in what we assume the initial state of an LB policy will be and whether or not we assume that it will immediately report a new picker when it gets its initial address update; different parts of our code make different assumptions. This PR establishes the convention that LB policies will be assumed to start in state CONNECTING and will not be assumed to report a new picker immediately upon getting their initial address update, and we now assume that convention everywhere consistently. This is a preparatory step for changing policies like round_robin to delegate to pick_first, which I'm working on in #32692. As part of that change, we need pick_first to not report a connectivity state until it actually sees the connectivity state of the underlying subchannels, so that round_robin knows when to swap over to a new child list without reintroducing the problem fixed in #31939.	2 years ago
Mark D. Roth	1432fe4e4c	[JSON] make API public but experimental (#32987 ) This makes the JSON API visible as part of the C-core API, but in the `experimental` namespace. It will be used as part of various experimental APIs that we will be introducing in the near future, such as the audit logging API.	2 years ago
Mark D. Roth	e872fb91d9	[WRR] fix some edge cases in scheduler logic (#33045 ) This corresponds to two recent changes made to our internal implementation. See b/276292666 for details.	2 years ago
Mark D. Roth	844e740183	[JSON] Replace ctors with factory methods (#32834 )	2 years ago
Eugene Ostroukhov	ac228814a0	[core] Expand core attributes to hold values of any type (#32835 )	2 years ago
Mark D. Roth	020e9b4dd6	[WRR] Remove env var guard for WRR policy (#32936 ) - remove the `_experimental` suffix from the gRPC policy name - remove the env var guard for the xDS policy config	2 years ago
Mark D. Roth	9393cd887c	[JSON] remove mutable accessor methods. (#32806 ) Co-authored-by: markdroth <markdroth@users.noreply.github.com>	2 years ago
Stan Hu	4110dea333	[HTTP Proxy] Support CIDR blocks in `no_proxy` config (#31119 ) This commit adds support for using CIDR blocks defined in the `no_proxy` environment variable. For example: ``` http_proxy=http://localhost:8080 no_proxy=10.10.0.0/24 ``` The example above would bypass the proxy if the server IP matched 10.10.0.0 - 10.10.0.255. Closes #22681 --------- Co-authored-by: Yash Tibrewal <yashkt@google.com>	2 years ago
Mark D. Roth	36d2716d52	[JSON] move Parse() and Dump() methods out of JSON object (#32742 ) More prep for making this a public API.	2 years ago
Craig Tiller	175ccc3a90	Reland global config changes (#32661 ) --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	2 years ago
Yousuk Seung	16c03db9ac	Revert "Revert "WRR: Support EPS" (#32723 )" (#32725 ) This reverts commit `7bd9267f32`.	2 years ago
Esun Kim	7bd9267f32	Revert "WRR: Support EPS" (#32723 ) Reverts grpc/grpc#32657	2 years ago
Yousuk Seung	4429066516	WRR: Support EPS (#32657 )	2 years ago
Jan Tattermusch	0c1797cd9f	Revert "[config] Move global config alongside core configuration" (#32659 ) Reverts grpc/grpc#30788 (it breaks grpc_objc_bazel_test (see https://github.com/grpc/grpc/pull/30788#issuecomment-1476372187) and also seems to be breaking some other internal stuff).	2 years ago
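
The CIDR matching described in commit 4110dea333 (`no_proxy=10.10.0.0/24` bypassing the proxy for 10.10.0.0 - 10.10.0.255) can be illustrated with a minimal sketch. This is Python using the standard `ipaddress` module, not gRPC's C++ implementation; the function name `bypasses_proxy` and the comma-separated parsing of `no_proxy` are assumptions made for illustration only.

```python
import ipaddress


def bypasses_proxy(server_ip: str, no_proxy: str) -> bool:
    """Return True if server_ip falls inside any CIDR block listed in no_proxy.

    Hypothetical helper: entries without a '/' (plain host names) are
    ignored here, since this sketch only covers the CIDR case.
    """
    addr = ipaddress.ip_address(server_ip)
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if "/" not in entry:
            continue  # not a CIDR block; out of scope for this sketch
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # malformed entry; skip it rather than fail
        if addr in network:
            return True
    return False
```

With `no_proxy` set to `10.10.0.0/24`, an address such as `10.10.0.17` would bypass the proxy under this sketch, while `10.10.1.1` would not.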


539 Commits (47306d78f4ca1df0259972a3408953c4c106edca)
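
The rule stated in commit ec39600872 (reset `non_empty_since` when a subchannel goes READY only if it has seen a previous state update and that previous state was not READY) reduces to a small predicate. The sketch below is Python rather than gRPC's C++ code; the function name and the use of plain strings for connectivity states are assumptions for illustration.

```python
from typing import Optional


def should_reset_non_empty_since(previous_state: Optional[str],
                                 new_state: str) -> bool:
    """Decide whether a state update should restart the blackout period.

    previous_state is None when the subchannel has not reported any state
    yet (hypothetical representation of "no previous update").
    """
    return (
        new_state == "READY"
        and previous_state is not None
        and previous_state != "READY"
    )
```

Under this predicate, an initial READY notification on a new address list and a duplicate READY notification both leave `non_empty_since` untouched, which covers the two cases the commit message describes.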