retry: handle all unhealthy upstreams in other_priority plugin (#4838)

This fixes a bug in the other_priority retry plugin that caused a crash
when a retry was attempted against an upstream with no healthy hosts.
The existing check for no healthy hosts was ineffective because of the
"everything is terrible" fallback in LoadBalancerBase, which sets the P0
load to 100 when all priorities are unhealthy.

The fix is to check the total health based on the loads computed in the
plugin, not the ones returned by LoadBalancerBase. When all hosts are
unhealthy, we return the original priority load. This ensures that we
keep whatever fallback the default LB uses when there are no healthy
hosts.
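
The check described above can be sketched roughly as follows. This is a
simplified, standalone illustration and not the plugin's actual code:
retryPriorityLoad, adjustedHealth, sumHealth and the plain-vector
PriorityLoad alias are hypothetical names, and per-priority health is
assumed to be a percentage between 0 and 100.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a priority load: percent of traffic per priority.
using PriorityLoad = std::vector<uint32_t>;

// Zero out the health of every priority that has already been attempted.
std::vector<uint32_t> adjustedHealth(const std::vector<uint32_t>& health,
                                     const std::vector<bool>& excluded) {
  std::vector<uint32_t> adjusted(health);
  for (size_t i = 0; i < adjusted.size(); ++i) {
    if (i < excluded.size() && excluded[i]) {
      adjusted[i] = 0;
    }
  }
  return adjusted;
}

uint32_t sumHealth(const std::vector<uint32_t>& health) {
  return std::accumulate(health.begin(), health.end(), 0u);
}

// Assumed helper: computes the load for a retry from the plugin's own view
// of per-priority health instead of trusting the default LB's load.
PriorityLoad retryPriorityLoad(const PriorityLoad& original_load,
                               const std::vector<uint32_t>& health,
                               const std::vector<bool>& excluded) {
  std::vector<uint32_t> adjusted = adjustedHealth(health, excluded);
  uint32_t total = sumHealth(adjusted);

  // No healthy priority left after exclusions: reset the exclusions.
  if (total == 0) {
    adjusted = health;
    total = sumHealth(adjusted);
  }

  // Still zero: every host is unhealthy, so keep the original load and
  // with it whatever fallback the default LB applies.
  if (total == 0) {
    return original_load;
  }

  // Distribute 100% across priorities in order, proportional to health.
  const uint32_t normalized_total = std::min<uint32_t>(100, total);
  PriorityLoad load(adjusted.size(), 0);
  uint32_t remaining = 100;
  for (size_t i = 0; i < adjusted.size(); ++i) {
    const uint32_t share =
        std::min<uint32_t>(remaining, adjusted[i] * 100 / normalized_total);
    load[i] = share;
    remaining -= share;
  }

  // Any rounding leftover goes to the first priority with nonzero health.
  if (remaining > 0) {
    const auto it = std::find_if(adjusted.begin(), adjusted.end(),
                                 [](uint32_t h) { return h > 0; });
    load[static_cast<size_t>(it - adjusted.begin())] += remaining;
  }
  return load;
}
```

The point of the sketch is the second zero check: because the decision
is made on the health values the plugin computed itself, the P0 = 100
fallback in the default load cannot hide the fact that nothing is
healthy.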

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Medium
Testing: Added regression test for no healthy hosts
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Snow Pettersen <snowp@squareup.com>

Mirrored from https://github.com/envoyproxy/envoy @ 59816a486c64cd05e9e0c0f08194b121690d6632
pull/620/head
data-plane-api(CircleCI) 6 years ago
parent 57012d4c64
commit 981878b844
envoy/config/retry/other_priority/other_priority_config.proto (3 changes)

@@ -21,6 +21,9 @@ package envoy.config.retry.other_priority;
// Attempt 3: P0 (no healthy priorities, reset)
// Attempt 4: P2
//
// In the case of all upstream hosts being unhealthy, no adjustments will be made to the original
// priority load, so behavior should be identical to not using this plugin.
//
// Using this PriorityFilter requires rebuilding the priority load, which runs in O(# of
// priorities), which might incur significant overhead for clusters with many priorities.
message OtherPriorityConfig {