upstream: handle health check fail after removal (#6765)

When using active health checking, hosts are not removed from
dynamic clusters while they are still passing health checks.
Previously, removal was only handled when the health check failure
came first; if a host was removed from service discovery and only
later failed its health check, it might not be removed for a very
long time. This change handles the second case, so that any time a
host is both removed AND failing active health check, in any order,
it will be removed.
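
The rule described above can be sketched as a small state machine. This is
only an illustration in Python, not Envoy's C++ implementation: the `Host`
and `Cluster` classes and their method names are hypothetical, and only the
two flag names are borrowed from `HostHealthStatus`.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for a cluster host; not Envoy's actual class."""
    address: str
    failed_active_health_check: bool = False
    pending_dynamic_removal: bool = False


class Cluster:
    """Hypothetical dynamic cluster that uses active health checking."""

    def __init__(self):
        self.hosts = {}

    def add_host(self, address):
        self.hosts[address] = Host(address)

    def on_discovery_removal(self, address):
        # Service discovery (e.g. EDS or DNS) no longer lists this host.
        host = self.hosts[address]
        if host.failed_active_health_check:
            # Failure came first, removal second: drop the host now.
            del self.hosts[address]
        else:
            # Still passing health checks: keep it, but remember the removal.
            host.pending_dynamic_removal = True

    def on_health_check_result(self, address, passed):
        host = self.hosts[address]
        host.failed_active_health_check = not passed
        if not passed and host.pending_dynamic_removal:
            # Removal came first, failure second: the case this change adds.
            del self.hosts[address]
```

With this sketch, a host is dropped as soon as both conditions hold,
whichever event arrives last.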

This has been an issue "forever" but is more obvious when using
streaming EDS or very long polling DNS.

Fixes https://github.com/envoyproxy/envoy/issues/6625

Signed-off-by: Matt Klein <mklein@lyft.com>

Mirrored from https://github.com/envoyproxy/envoy @ 41eefffcd728d071037a57a1accd402ec188bcd5
data-plane-api(CircleCI) 6 years ago
parent 1e6a3ddd6e
commit 429644f1b4
 envoy/admin/v2alpha/clusters.proto | 4 ++++

@@ -78,6 +78,10 @@ message HostHealthStatus {
   // The host is currently being marked as degraded through active health checking.
   bool failed_active_degraded_check = 4;
 
+  // The host has been removed from service discovery, but is being stabilized due to active
+  // health checking.
+  bool pending_dynamic_removal = 5;
+
   // Health status as reported by EDS. Note: only HEALTHY and UNHEALTHY are currently supported
   // here.
   // TODO(mrice32): pipe through remaining EDS health status possibilities.
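
As a rough illustration of how the new field might be consumed, the Python
sketch below scans a JSON rendering of the admin clusters message for hosts
with `pending_dynamic_removal` set. The sample payload is made up, and its
shape (`cluster_statuses`, `host_statuses`, `health_status`, snake_case keys
mirroring the proto field names) is an assumption; `hosts_pending_removal`
is a hypothetical helper, not part of Envoy.

```python
import json

# Made-up sample data; the JSON layout is assumed to mirror the proto fields.
SAMPLE = """
{
  "cluster_statuses": [
    {
      "name": "example_cluster",
      "host_statuses": [
        {"address": {"socket_address": {"address": "10.0.0.1", "port_value": 8080}},
         "health_status": {"pending_dynamic_removal": true}},
        {"address": {"socket_address": {"address": "10.0.0.2", "port_value": 8080}},
         "health_status": {}}
      ]
    }
  ]
}
"""


def hosts_pending_removal(clusters):
    """Return (cluster name, host:port) pairs flagged as pending removal."""
    pending = []
    for cluster in clusters.get("cluster_statuses", []):
        for host in cluster.get("host_statuses", []):
            status = host.get("health_status", {})
            # An unset bool is typically omitted from JSON, so default to False.
            if status.get("pending_dynamic_removal", False):
                sock = host.get("address", {}).get("socket_address", {})
                endpoint = f'{sock.get("address")}:{sock.get("port_value")}'
                pending.append((cluster.get("name"), endpoint))
    return pending


result = hosts_pending_removal(json.loads(SAMPLE))
```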
