Chiebot-Mirror/grpc

Commit Graph

Author	SHA1	Message	Date
Alisha Nanda	f7fc3fbed4	[tracing] Add annotation with metadata sizes and limits (#33910 ) Only create annotation when call is sampled for cost reasons. --------- Co-authored-by: ananda1066 <ananda1066@users.noreply.github.com>	1 year ago
Hannah Shi	cd85d7edf2	[ObjC] dns service resolver for cf event engine (#33233 ) Implement DNS using dns service for iOS. Current limitation: 1. Using a custom name server is not supported. 2. Only supports `LookupHostname`. `LookupSRV` and `LookupTXT` are not implemented. 3. Not tested with single stack (ipv4 or ipv6) environment 4. ~Not tested with multiple ip records per stack~ manually tested with wsj.com 5. Not tested with multiple interface environment	1 year ago
Alisha Nanda	9aca06d38a	Revert "[c-ares DNS resolver] Fix file descriptor use-after-close bug when c-ares writes succeed but subsequent read fails" (#33934 ) Reverts grpc/grpc#33871 due to build failures in google3. Co-authored-by: Yijie Ma <yijiem@google.com>	1 year ago
Yash Tibrewal	010a59b7fd	[keepalive] Allow server side keepalive_permit_without_calls setting to be overridden (#33917 ) Need the ability to override server-side keepalive permit without calls default without affecting client-side settings.	1 year ago
apolcyn	76203ba589	[c-ares DNS resolver] Fix file descriptor use-after-close bug when c-ares writes succeed but subsequent read fails (#33871 ) Normally, c-ares related fds are destroyed after all DNS resolution is finished in [this code path](`c82d31677a/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc (L210)`). Also there are some fds that c-ares may fail to open or write to initially, and c-ares will close them internally before grpc ever knows about them. But if: 1) c-ares opens a socket and successfully writes a request on it 2) then a subsequent read fails Then c-ares will close the fd in [this code path](`bad62225b7/src/lib/ares_process.c (L740)`), but gRPC will have a reference on the fd and will still use it afterwards. Fix here is to leverage the c-ares socket-override API to properly track fd ownership between c-ares and grpc. Related: internal issue b/292203138	1 year ago
AJ Heller	0e9553cf4e	[EventEngine] Add TODOs to re-enable EventEngine end2end tests (#33911 ) These tests should be re-enabled before we claim confidence in the engine implementations. It seems these tests are still being run, not sure if that's true in all cases ([example](https://source.cloud.google.com/results/invocations/be524340-c98f-4915-a833-192047ae9925/targets/%2F%2Ftest%2Fcore%2Fend2end:call_creds_test@experiment%3Devent_engine_client/log)). Alternatively, we can scrap this PR and enable all tests now if you feel you're ready to start looking at PosixEventEngine test failures. CC @yijiem @ctiller	1 year ago
AJ Heller	0897f0faf3	[EventEngine][Windows] Temporary changes for rare-flake debugging (#33894 ) CNR a WindowsEventEngine listener flake in: * 10k local Windows development machine runs * 50k Windows RBE runs * 10k Windows VM runs It fails ~5 times per day on the master CI jobs. This PR adds some logging to try to see if an edge is missed, and switches the thread pool implementation to see if that makes the flake go away. If the flakes disappear, I'll try removing one or the other to see if either independently fix the problem (hopefully not logging). --------- Co-authored-by: drfloob <drfloob@users.noreply.github.com>	1 year ago
Craig Tiller	a008026890	[fuzzing] Increase deadline, fix b/293425905 (#33897 )	1 year ago
Yijie Ma	7524e899d1	Revert "[CI breakage] Skip some dns tests as a temporary workaround" (#33882 ) Reverts grpc/grpc#33819 Verified that it passed these jobs: `grpc/core/master/linux/grpc_basictests_c_cpp_dbg` `grpc/core/master/linux/grpc_basictests_c_cpp_opt` `grpc/core/master/linux/grpc_portability`	1 year ago
Craig Tiller	8c2c35785e	[chttp2] Fix for when global config is overridden via InitGoogleTest (#33885 ) (should unblock the current import)	1 year ago
Craig Tiller	3717ff04ba	[chttp2] Split ping policy from transport (#33703 ) Why: Cleanup for chttp2_transport ahead of promise conversion - lots of logic has become interleaved throughout chttp2, so some effort to isolate logic out is warranted ahead of that conversion. What: Split configuration and policy tracking for each of ping rate throttling and abuse detection into their own modules. Add tests for them. Incidentally: Split channel args into their own header so that we can split the policy stuff into separate build targets. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	417c8e3499	[fuzzing] Increase deadline, tweak timeouts for b/291372661 (#33766 )	1 year ago
Matthew Stevenson	99b0e54877	[ssl] Disable slow SSL transport security tests for UBSAN builds. (#33824 ) This PR is expected to fix the flakes of `//test/core/tsi:ssl_transport_security_test` when built under UBSAN. Why is this needed? There are several tests in `ssl_transport_security_test.cc` that involve doing many expensive operations and PR #33638 recently added one more (namely, repeatedly signing with an ECDSA key). The slow tests are already altered for MSAN and TSAN, and now we need to do the same for UBSAN.	1 year ago
Vignesh Babu	f4f3a907f3	[import] Fix missing dependency in experiments_tag_test (#33827 )	1 year ago
Yijie Ma	e74b7d8262	[CI breakage] Skip some dns tests as a temporary workaround (#33819 ) Those tests are failing on CIs which do not have twisted installed. Skip them for now and will fix the docker images next.	1 year ago
Vignesh Babu	f85b7c79ee	[experiments] Fix processing of platform specific test tags (#33749 ) Also adds a unit test: experiments_tag_test which should fail if the appropriate tags are not set for it.	1 year ago
Yijie Ma	a7bf07e86a	[EventEngine] PosixEventEngine DNS Resolver (#32701 ) This PR implements a c-ares based DNS resolver for EventEngine with the reference from the original [grpc_ares_wrapper.h](../blob/master/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.h). The PosixEventEngine DNSResolver is implemented on top of that. Tests which use the client channel resolver API ([resolver.h](../blob/master/src/core/lib/resolver/resolver.h#L54)) are ported, namely the [resolver_component_test.cc](../blob/master/test/cpp/naming/resolver_component_test.cc) and the [cancel_ares_query_test.cc](../blob/master/test/cpp/naming/cancel_ares_query_test.cc). The WindowsEventEngine DNSResolver will use the same EventEngine's grpc_ares_wrapper and will be worked on next. The [resolve_address_test.cc](https://github.com/grpc/grpc/blob/master/test/core/iomgr/resolve_address_test.cc) which uses the iomgr [DNSResolver](../blob/master/src/core/lib/iomgr/resolve_address.h#L44) API has been ported to EventEngine's dns_test.cc. That leaves only 2 tests which use iomgr's API, notably the [dns_resolver_cooldown_test.cc](../blob/master/test/core/client_channel/resolvers/dns_resolver_cooldown_test.cc) and the [goaway_server_test.cc](../blob/master/test/core/end2end/goaway_server_test.cc) which probably need to be restructured to use EventEngine DNSResolver (for one thing they override the original grpc_ares_wrapper's free functions). I will try to tackle these in the next step.	1 year ago
AJ Heller	112421760a	[EventEngine] Eliminate busy loop in the work stealing lifeguard's shutdown (#33386 ) Co-authored-by: drfloob <drfloob@users.noreply.github.com>	1 year ago
AJ Heller	0155478ae7	[test] Increase timeout for ssl_transport_security_test (#33789 ) This test appears to be timing out more often lately. Example: https://fusion2.corp.google.com/ci/kokoro/prod:grpc%2Fcore%2Fpull_request%2Flinux%2Fbazel_rbe%2Fgrpc_bazel_rbe_ubsan/activity/980ac4a8-da71-4b9b-838e-e9ea235820a1/log	1 year ago
Craig Tiller	5b46c8bdba	[fuzzing] Increase deadline, fix b/291630910 (#33768 )	1 year ago
Craig Tiller	4c7107794d	[promises] Handle the case that a rejection happens without reporting to the app (#33782 ) Promises code can prevent these bad requests from even reaching the application, which is beneficial but this test needs a minor update to handle it. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	25ed074b0d	[bad_client] Increase timeout (saw this exceeded internally) (#33781 ) Unblocks promises rollout	1 year ago
Craig Tiller	e821494739	[test] Increase deadline after observed failure internally (#33778 ) (needed to unblock promises rollout)	1 year ago
AJ Heller	b33a0781fa	[build] Private visibility for internal EE library (#33764 )	1 year ago
Craig Tiller	112a29c6af	[fuzzing] Increase deadline (#33765 ) Fix b/290782226	1 year ago
Yijie Ma	73605f4eac	[EventEngine] Change `GetDNSResolver` to return `absl::StatusOr<std::unique_ptr<DNSResolver>>` (#33744 ) Based on the discussion at: `595a75cc5d..e3b402a8fa (r1244325752)`	1 year ago
Matthew Stevenson	fae2982647	[ssl] Fix SSL stack to handle large handshake messages whose length exceeds the BIO buffer size. (#33638 ) There is a bug in the SSL stack that was only partially fixed in #29176: if more than 17kb is written to the BIO buffer, then everything over 17kb will be discarded, and the SSL handshake will fail with a bad record mac error or hang if not enough bytes have arrived yet. It's relatively uncommon to hit this bug, because the TLS handshake messages need to be much larger than normal for you to have a chance of hitting this bug. However, there was a separate bug in the SSL stack (recently fixed in #33558) that causes the ServerHello produced by a gRPC-C++ TLS server to grow linearly in size with the size of the trust bundle; these 2 bugs combined to cause a large number of TLS handshake failures for gRPC-C++ clients talking to gRPC-C++ servers when the server had a large trust bundle. This PR fixes the bug by ensuring that all bytes are successfully written to the BIO buffer. An initial quick fix for this bug was planned in #33611, but abandoned because we were worried about temporarily doubling the memory footprint of all SSL channels. The complexity in this PR is mostly in the test: it is fairly tricky to force gRPC-C++'s SSL stack to generate a sufficiently large ServerHello to trigger this bug.	1 year ago
Mark D. Roth	083bbee480	[LB policies] revert changes for dualstack design (#33718 ) This reverts the following PRs: #32692 #33087 #33093 #33427 #33568 These changes seem to have introduced some flaky crashes. Reverting while I investigate.	1 year ago
Craig Tiller	e9ba954eef	[owners] Remove CODEOWNERS for ctiller where its no longer necessary (#33704 )	1 year ago
Mark D. Roth	ec39600872	[WRR] fix bugs that caused us to re-enter blackout period upon updates (#33694 ) As per gRFC A58, when WRR sees a subchannel report READY, it reset the non_empty_since value, thus restarting the blackout period. However, there were two cases where we were incorrectly triggering this code: 1. When WRR got an updated address list that contained addresses that were already present on the old list and whose subchannels were already in READY state, the initial notification for those subchannels on the new list was READY, which incorrectly triggered resetting the non_empty_since value. 2. Due to a bug in the outlier_detection policy, whenever an update was propagated down through the OD policy without actually enabling OD, it would incorrectly send a duplicate connectivity state notification for the subchannels. This meant that a subchannel that was already in state READY would report READY again, which would also incorrectly trigger resetting the non_empty_since value. This PR makes two changes: 1. Fix the bug in outlier_detection that caused it to generate the spurious duplicate READY updates. 2. Fix WRR to reset the non_empty_since value when a subchannel goes READY only if the subchannel has seen a previous state update and only if that previous state was not READY. (The duplicate READY notifications should not actually happen anymore now that the OD policy has been fixed, but better to be defensive.) Fixes b/290983884.	1 year ago
Mario Jones Vimal	a934848de5	[core/security] Add Custom Token Lifetime - Service Acc Impersonation (#33351 ) Adds access token lifetime configuration for workload identity federation with service account impersonation for both explicit and implicit flows. Changes: 1. Adds a new member "service_account_impersonation" to the ExternalAccountCredentials class. "token_lifetime_seconds" is a member of "service_account_impersonation". 2. Adds validation checks, like token_lifetime_seconds should be between the minimum and maximum accepted value, during the creation of an ExternalAccountCredentials object. 3. Appends "lifetime" to the body of the service account impersonation request. Tests: 1. Modifies a test to check if the default value is passed when "service_account_impersonation" is empty. 2. Adds tests to check if the token_lifetime_seconds value is propagated to the request body. 3. Adds tests to verify that an error is thrown when token_lifetime_seconds is invalid.	1 year ago
Craig Tiller	7223a9e5fe	[fuzzing] Increase deadline (#33663 ) Fix b/290886936	1 year ago
Craig Tiller	86d7c8125e	[fuzzing] Increase deadline (#33658 ) Resolves b/290812157	1 year ago
Craig Tiller	8845e290db	[filter-test] Enhancements for better testing (#33652 ) - Support call finalizers in filter test. - Add an accessor to the filter implementation from the channel, so that it can be interrogated by tests. - Matcher to ensure that some metadata is not in a metadata batch (functionality needed to support the additional testing we talked about this morning) --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	b7077f4bbf	[hpack] Rollforward huffman read optimization (#33657 ) Rollforward in first commit, fixes in subsequent.	1 year ago
Craig Tiller	57c697d8ae	Revert "[hpack] Huffman read optimization" (#33655 ) Reverts grpc/grpc#33269	1 year ago
Craig Tiller	4ce51fe45d	[hpack] Huffman read optimization (#33269 ) In real services most of our time ends up in the `Read1()` function, which populates one byte into the bit buffer. Change this to read in as many as possible bytes at a time into that buffer. Additionally, generate all possible (to some depth) parser geometries, and add a benchmark for them. Run that benchmark and select the best geometry for decoding base64 strings (since this is the main use-case). (gives about a 30% speed boost parsing base64 then huffman encoded random binary strings) --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Vignesh Babu	974798a427	[tracing] Fix flakiness in tcp_posix_test (#33639 ) tcp_posix_test is incorrectly assuming that all endpoint_writes with timestamps enabled will be successfully traced. Remove the timestamps checking related tests to prevent flakes when the test is enabled internally.	1 year ago
Craig Tiller	ed587f2b07	[hpack] Reduce parse table size in the rare case of a parse error (#33637 ) Most of the time parsing succeeds, and only rarely do we see an error. This change reduces the parse memento size from 120 bytes to 56 bytes. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	dc5c99c9b4	[fuzzing] Increase deadline (#33600 ) Similar pattern to many others.. increase this deadline to have the fuzzer pass. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	d3d4d5309d	[end2end] Fix fuzzer found deadline bug (#33633 ) fix b/290140776	1 year ago
Craig Tiller	cdfbb0ced7	[end2end] Fix fuzzer found deadline bug (#33629 ) Fixes b/288888511	1 year ago
Craig Tiller	e28729fe0a	[end2end] Fix fuzzer found deadline bug (#33630 ) fix b/288965746	1 year ago
Craig Tiller	f417da77a6	[end2end] Fix fuzzer found deadline bug (#33631 ) fix b/288718007	1 year ago
Craig Tiller	4b7a360041	[end2end] Fix fuzzer found deadline bug (#33632 ) fix b/289593034	1 year ago
nanahpang	0cc9d16e9c	[chaotic-good] Implement a promise-based endpoint for chaotic-good transport to read & write to EventEngine::Endpoint. (#33257 ) This PR is continuing the work of prototyping in https://github.com/grpc/grpc/pull/31592, and the design doc is at [link](https://docs.google.com/document/d/1vRy0yse-d1heLQRmLPo_0figsTPXJAnNN84tBCAne_s/edit?pli=1&resourcekey=0-JvUPdq0LaZq8gMkgT9Pzlw#heading=h.qgvc5vr55ytg).	1 year ago
Yash Tibrewal	9984f1bd5b	Revert "Revert "Revert "Revert "[HTTP2] Fix inconsistencies in keepalive configuration ( #33428 )" (#33512 )"" (#33601 ) Reverts grpc/grpc#33599 Needs to be cherry-picked	1 year ago
Craig Tiller	08f1cc3ba8	[end2end] Explain failures a little better (#33621 ) I'd been adding the following stanza regularly to debug flakes/fuzz failures: ``` Expect(1, CoreEnd2endTest::MaybePerformAction{[&](bool success) { Crash(absl::StrCat( "Unexpected completion of client side call: success=", success ? "true" : "false", " status=", server_status.ToString(), " initial_md=", server_initial_metadata.ToString())); }}); ``` it was helpful because it indicated why a call batch finished successfully and helped quickly identify next steps. It occurred to me however that this would better be done inside of the framework, and for all ops that have outputs, so this PR does just that. Any time a batch with an op that outputs information finishes successfully but unexpectedly we now display those outputs in human readable form in the error message. Sample output: ``` [ RUN ] CorpusExamples/FuzzerCorpusTest.RunOneExample/0 RUN TEST: Http2SingleHopTest.SimpleDelayedRequestShort/Chttp2SimpleSslFullstack E0101 00:00:05.000000000 396633 simple_delayed_request.cc:37] Create client side call E0101 00:00:05.000000000 396633 simple_delayed_request.cc:41] Start initial batch E0101 00:00:05.000000000 396633 simple_delayed_request.cc:47] Start server E0101 00:00:05.000000000 396633 cq_verifier.cc:364] Verify tag(101)-✅ for 600000ms test/core/end2end/cq_verifier.cc:316: Unexpected event: OP_COMPLETE: tag:0x1 OK with: incoming_metadata: {} status_on_client: status=4 msg=Deadline Exceeded trailing_metadata={} checked @ test/core/end2end/tests/simple_delayed_request.cc:51 expected: test/core/end2end/tests/simple_delayed_request.cc:50: tag(101) success=true ```	1 year ago
Craig Tiller	d139c4a014	[metadata] Add an experiment to ensure a unique refcount on parsed slice strings (#33205 ) The intuition here is that these strings may end up in the hpack table, and then unnecessarily extend the lifetime of the read blocks. Instead, take a copy of these short strings when we need to and allow the incoming large memory object to be discarded. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
Craig Tiller	c5bb43ab61	[chttp2] Eliminate grpc_chttp2_stream_map (#33503 ) No need for a bespoke type anymore... and a step along the path to C++ification. --------- Co-authored-by: ctiller <ctiller@users.noreply.github.com>	1 year ago
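
The WRR fix in commit ec39600872 (#33694) describes a rule that may be easier to follow as code: reset `non_empty_since` when a subchannel goes READY only if the subchannel has already seen a state update and that previous state was not READY. The following is a minimal sketch of that rule only, not the gRPC implementation. `SubchannelTracker`, `ConnectivityState`, `OnStateUpdate`, and the `blackout_resets` counter are hypothetical names invented for illustration; the counter stands in for the actual reset of `non_empty_since`.

```cpp
#include <optional>

// Hypothetical connectivity states; not the real gRPC enum.
enum class ConnectivityState { kIdle, kConnecting, kReady, kTransientFailure };

// Hypothetical per-subchannel bookkeeping, for illustration only.
struct SubchannelTracker {
  // Empty until the subchannel reports its first state.
  std::optional<ConnectivityState> last_state;
  // Counts how many times the blackout period would have been restarted.
  int blackout_resets = 0;

  void OnStateUpdate(ConnectivityState new_state) {
    // Restart the blackout period only on a genuine transition into READY:
    // there must be a previous state, and it must not already be READY.
    const bool became_ready = new_state == ConnectivityState::kReady &&
                              last_state.has_value() &&
                              *last_state != ConnectivityState::kReady;
    if (became_ready) ++blackout_resets;
    last_state = new_state;
  }
};
```

Under this rule, an initial READY notification (case 1 in the commit message) and a duplicate READY notification (case 2) both leave the counter unchanged, while a subchannel that drops out of READY and later returns triggers one reset.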


8239 Commits (571da7be784da1bb6d81d5b973aad0c95c61aabc)