[grpclb_e2e_test] fix a couple of flakes (#37677)

One flake is in the `UsePickFirstChildPolicy` test, where PF happens to choose the second backend instead of the first one.  This can happen when the connection attempts are run in parallel due to happy eyeballs and the second one happens to finish before the first one.  Example failure:

https://btx.cloud.google.com/invocations/a48ad3f4-c3fb-4cc7-9af4-dec027504f4a/targets/%2F%2Ftest%2Fcpp%2Fend2end:grpclb_end2end_test@poller%3Dpoll;config=e15de0119d30933c31c54fd53f0490b0130f57af5825b09748361d587df0a7d0/tests

I fixed this by changing the test to require that all traffic goes to one of the two backends instead of requiring that it is the first one.

The other flake is in the `SwapChildPolicy` test, where the traffic is not being evenly distributed to the backends under round robin.  Example failure:

https://btx.cloud.google.com/invocations/57ac3d4c-ce2a-45be-8147-0cbed8dd8786/targets/%2F%2Ftest%2Fcpp%2Fend2end:grpclb_end2end_test@experiment%3Dwork_serializer_dispatch;config=56f5b09615e325097b100b58c41171656571290519a83c5d89a6067ef0283d46/tests

I'm honestly not sure what's causing this, but I don't think the test actually needs to verify the behavior of round_robin in the first place; it's sufficient to just check that traffic is going to both backends to know that we've switched from PF to RR.  So I've simply removed the additional checks here.

Closes #37677

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37677 from markdroth:grpclb_e2e_test_flake 48b78ca8ad
PiperOrigin-RevId: 673525103
pull/37691/head
Mark D. Roth 2 months ago committed by Copybara-Service
parent 04f05a3031
commit e4688edc13
  1. 42
      test/cpp/end2end/grpclb_end2end_test.cc

@ -952,12 +952,19 @@ TEST_F(GrpclbEnd2endTest, UsePickFirstChildPolicy) {
"}");
SendBalancerResponse(BuildResponseForBackends(GetBackendPorts(), {}));
CheckRpcSendOk(kNumRpcs, 3000 /* timeout_ms */, true /* wait_for_ready */);
// Check that all requests went to the first backend. This verifies
// that we used pick_first instead of round_robin as the child policy.
EXPECT_EQ(backends_[0]->service().request_count(), kNumRpcs);
for (size_t i = 1; i < backends_.size(); ++i) {
EXPECT_EQ(backends_[i]->service().request_count(), 0UL);
// Check that all requests went to one backend. This verifies that we
// used pick_first instead of round_robin as the child policy.
bool found = false;
for (size_t i = 0; i < backends_.size(); ++i) {
if (backends_[i]->service().request_count() > 0) {
LOG(INFO) << "backend " << i << " saw traffic";
EXPECT_EQ(backends_[i]->service().request_count(), kNumRpcs)
<< "backend " << i;
EXPECT_FALSE(found) << "multiple backends saw traffic";
found = true;
}
}
EXPECT_TRUE(found) << "no backends saw traffic";
// The balancer got a single request.
EXPECT_EQ(1U, balancer_->service().request_count());
// and sent a single response.
@ -982,21 +989,24 @@ TEST_F(GrpclbEnd2endTest, SwapChildPolicy) {
"}");
SendBalancerResponse(BuildResponseForBackends(GetBackendPorts(), {}));
CheckRpcSendOk(kNumRpcs, 3000 /* timeout_ms */, true /* wait_for_ready */);
// Check that all requests went to the first backend. This verifies
// that we used pick_first instead of round_robin as the child policy.
EXPECT_EQ(backends_[0]->service().request_count(), kNumRpcs);
for (size_t i = 1; i < backends_.size(); ++i) {
EXPECT_EQ(backends_[i]->service().request_count(), 0UL);
// Check that all requests went to one backend. This verifies that we
// used pick_first instead of round_robin as the child policy.
bool found = false;
for (size_t i = 0; i < backends_.size(); ++i) {
if (backends_[i]->service().request_count() > 0) {
LOG(INFO) << "backend " << i << " saw traffic";
EXPECT_EQ(backends_[i]->service().request_count(), kNumRpcs)
<< "backend " << i;
EXPECT_FALSE(found) << "multiple backends saw traffic";
found = true;
}
}
EXPECT_TRUE(found) << "no backends saw traffic";
// Send new resolution that removes child policy from service config.
SetNextResolutionDefaultBalancer();
// We should now be using round_robin, which will send traffic to all
// backends.
WaitForAllBackends();
CheckRpcSendOk(kNumRpcs, 3000 /* timeout_ms */, true /* wait_for_ready */);
// Check that every backend saw the same number of requests. This verifies
// that we used round_robin.
for (size_t i = 0; i < backends_.size(); ++i) {
EXPECT_EQ(backends_[i]->service().request_count(), 2UL);
}
// The balancer got a single request.
EXPECT_EQ(1U, balancer_->service().request_count());
// and sent a single response.

Loading…
Cancel
Save