This PR avoids a situation in which we were losing track of data plane requests as follows:
1. Management server asks Envoy to track load stats for cluster Foo in a LoadStatsResponse for a 10s period. Envoy resets stats for Foo.
2. After the 10s timer, Envoy responds with Foo's stats, resetting them.
3. Management server asks Envoy to track load stats for cluster Foo in a LoadStatsResponse for a 10s period. Envoy resets stats for Foo.
4. After the 10s timer, Envoy responds with Foo's stats, resetting them.
Between 2 and 3, any stats for Foo requests that arrive were previously unaccounted for. We resolve
this (in a relatively backward compatible way) by not making any protocol changes except to require
Envoy to not reset stats for already tracked clusters.
If we were to design LRS from scratch to avoid this, there are better approaches, e.g. making it a
periodic reporting service rather than request-response, but we probably already have a bunch of
existing users of LRS and don't want to break them.
Rist Level: Low
Testing: Modified load_stats_integration_test.
Signed-off-by: Harvey Tuch <htuch@google.com>
Mirrored from https://github.com/envoyproxy/envoy @ 27c99de910788c2c5ca95a87cee00e55a33da638