docs: move xds protocol to rst (#6670)

This PR moves the xds protocol from md to rst.

Risk Level: Low
Testing: N/A
Docs Changes: N/A
Release Notes: N/A

Fixes #6338

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>

Mirrored from https://github.com/envoyproxy/envoy @ a3fe3c6ef03ae7386974bc27225700eab1b48a6f
pull/620/head
data-plane-api(CircleCI) 6 years ago
parent 0cb4a00079
commit 59afd49a7a
  1. 396
      XDS_PROTOCOL.md
  2. 456
      xds_protocol.rst

@ -1,396 +0,0 @@
# xDS REST and gRPC protocol
Envoy discovers its various dynamic resources via the filesystem or by querying
one or more management servers. Collectively, these discovery services and their
corresponding APIs are referred to as _xDS_. Resources are requested via
_subscriptions_, by specifying a filesystem path to watch, initiating gRPC
streams or polling a REST-JSON URL. The latter two methods involve sending
requests with a
[`DiscoveryRequest`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/discovery.proto#discoveryrequest)
proto payload. Resources are delivered in a
[`DiscoveryResponse`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/discovery.proto#discoveryresponse)
proto payload in all methods. We discuss each type of subscription below.
## Filesystem subscriptions
The simplest approach to delivering dynamic configuration is to place it at a
well known path specified in the
[`ConfigSource`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/core/config_source.proto#core-configsource).
Envoy will use `inotify` (`kqueue` on macOS) to monitor the file for changes
and parse the `DiscoveryResponse` proto in the file on update. Binary
protobufs, JSON, YAML and proto text are supported formats for the
`DiscoveryResponse`.
There is no mechanism available for filesystem subscriptions to ACK/NACK updates
beyond stats counters and logs. The last valid configuration for an xDS API will
continue to apply if an configuration update rejection occurs.
## Streaming gRPC subscriptions
### Singleton resource type discovery
A gRPC
[`ApiConfigSource`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/core/config_source.proto#core-apiconfigsource)
can be specified independently for each xDS API, pointing at an upstream
cluster corresponding to a management server. This will initiate an independent
bidirectional gRPC stream for each xDS resource type, potentially to distinct
management servers. API delivery is eventually consistent. See
[ADS](#aggregated-discovery-service) below for situations in which explicit
control of sequencing is required.
#### Type URLs
Each xDS API is concerned with resources of a given type. There is a 1:1
correspondence between an xDS API and a resource type. That is:
* [LDS: `envoy.api.v2.Listener`](envoy/api/v2/lds.proto)
* [RDS: `envoy.api.v2.RouteConfiguration`](envoy/api/v2/rds.proto)
* [VHDS: `envoy.api.v2.Vhds`](envoy/api/v2/rds.proto)
* [CDS: `envoy.api.v2.Cluster`](envoy/api/v2/cds.proto)
* [EDS: `envoy.api.v2.ClusterLoadAssignment`](envoy/api/v2/eds.proto)
* [SDS: `envoy.api.v2.Auth.Secret`](envoy/api/v2/auth/cert.proto)
The concept of [_type
URLs_](https://developers.google.com/protocol-buffers/docs/proto3#any) appears
below, and takes the form `type.googleapis.com/<resource type>`, e.g.
`type.googleapis.com/envoy.api.v2.Cluster` for CDS. In various requests from
Envoy and responses by the management server, the resource type URL is stated.
#### ACK/NACK and versioning
Each stream begins with a `DiscoveryRequest` from Envoy, specifying the list of
resources to subscribe to, the type URL corresponding to the subscribed
resources, the node identifier and an empty `version_info`. An example EDS request
might be:
```yaml
version_info:
node: { id: envoy }
resource_names:
- foo
- bar
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
response_nonce:
```
The management server may reply either immediately or when the requested
resources are available with a `DiscoveryResponse`, e.g.:
```yaml
version_info: X
resources:
- foo ClusterLoadAssignment proto encoding
- bar ClusterLoadAssignment proto encoding
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
nonce: A
```
After processing the `DiscoveryResponse`, Envoy will send a new request on the
stream, specifying the last version successfully applied and the nonce provided
by the management server. If the update was successfully applied, the
`version_info` will be __X__, as indicated in the sequence diagram:
![Version update after ACK](diagrams/simple-ack.svg)
In this sequence diagram, and below, the following format is used to abbreviate
messages:
* `DiscoveryRequest`: (V=`version_info`,R=`resource_names`,N=`response_nonce`,T=`type_url`)
* `DiscoveryResponse`: (V=`version_info`,R=`resources`,N=`nonce`,T=`type_url`)
The version provides Envoy and the management server a shared notion of the
currently applied configuration, as well as a mechanism to ACK/NACK
configuration updates. If Envoy had instead rejected configuration update __X__,
it would reply with
[`error_detail`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/discovery.proto#envoy-api-field-discoveryrequest-error-detail)
populated and its previous version, which in this case was the empty
initial version. The error_detail has more details around the exact error message
populated in the message field:
![No version update after NACK](diagrams/simple-nack.svg)
Later, an API update may succeed at a new version __Y__:
![ACK after NACK](diagrams/later-ack.svg)
Each stream has its own notion of versioning, there is no shared versioning
across resource types. When ADS is not used, even each resource of a given
resource type may have a
distinct version, since the Envoy API allows distinct EDS/RDS resources to point
at different `ConfigSource`s.
#### When to send an update
The management server should only send updates to the Envoy client when the
resources in the `DiscoveryResponse` have changed. Envoy replies to any
`DiscoveryResponse` with a `DiscoveryRequest` containing the ACK/NACK
immediately after it has been either accepted or rejected. If the management
server provides the same set of resources rather than waiting for a change to
occur, it will cause Envoy and the management server to spin and have a severe
performance impact.
Within a stream, new `DiscoveryRequest`s supersede any prior `DiscoveryRequest`s
having the same resource type. This means that the management server only needs
to respond to the latest `DiscoveryRequest` on each stream for any given resource
type.
#### Resource hints
The `resource_names` specified in the `DiscoveryRequest` are a hint. Some
resource types, e.g. `Cluster`s and `Listener`s will specify an empty
`resource_names` list, since Envoy is interested in learning about all the
`Cluster`s (CDS) and `Listener`s (LDS) that the management server(s) know about
corresponding to its node identification. Other resource types, e.g.
`RouteConfiguration`s (RDS) and `ClusterLoadAssignment`s (EDS), follow from
earlier CDS/LDS updates and Envoy is able to explicitly enumerate these
resources.
LDS/CDS resource hints will always be empty and it is expected that the
management server will provide the complete state of the LDS/CDS resources in
each response. An absent `Listener` or `Cluster` will be deleted.
For EDS/RDS, the management server does not need to supply every requested
resource and may also supply additional, unrequested resources. `resource_names`
is only a hint. Envoy will silently ignore any superfluous resources. When a
requested resource is missing in a RDS or EDS update, Envoy will retain the last
known value for this resource except in the case where the `Cluster` or `Listener`
is being warmed. See [Resource warming](#resource-warming) section below on the expectations
during warming. The management server may be able to infer all
the required EDS/RDS resources from the `node` identification in the
`DiscoveryRequest`, in which case this hint may be discarded. An empty EDS/RDS
`DiscoveryResponse` is effectively a nop from the perspective of the respective
resources in the Envoy.
When a `Listener` or `Cluster` is deleted, its corresponding EDS and RDS
resources are also deleted inside the Envoy instance. In order for EDS resources
to be known or tracked by Envoy, there must exist an applied `Cluster`
definition (e.g. sourced via CDS). A similar relationship exists between RDS and
`Listeners` (e.g. sourced via LDS).
For EDS/RDS, Envoy may either generate a distinct stream for each resource of a
given type (e.g. if each `ConfigSource` has its own distinct upstream cluster
for a management server), or may combine together multiple resource requests for
a given resource type when they are destined for the same management server.
While this is left to implementation specifics, management servers should be capable
of handling one or more `resource_names` for a given resource type in each
request. Both sequence diagrams below are valid for fetching two EDS resources
`{foo, bar}`:
![Multiple EDS requests on the same stream](diagrams/eds-same-stream.svg)
![Multiple EDS requests on distinct streams](diagrams/eds-distinct-stream.svg)
#### Resource updates
As discussed above, Envoy may update the list of `resource_names` it presents to
the management server in each `DiscoveryRequest` that ACK/NACKs a specific
`DiscoveryResponse`. In addition, Envoy may later issue additional
`DiscoveryRequest`s at a given `version_info` to update the management server
with new resource hints. For example, if Envoy is at EDS version __X__ and knows
only about cluster `foo`, but then receives a CDS update and learns about `bar`
in addition, it may issue an additional `DiscoveryRequest` for __X__ with
`{foo,bar}` as `resource_names`.
![CDS response leads to EDS resource hint update](diagrams/cds-eds-resources.svg)
There is a race condition that may arise here; if after a resource hint update
is issued by Envoy at __X__, but before the management server processes the
update it replies with a new version __Y__, the resource hint update may be
interpreted as a rejection of __Y__ by presenting an __X__ `version_info`. To
avoid this, the management server provides a `nonce` that Envoy uses to indicate
the specific `DiscoveryResponse` each `DiscoveryRequest` corresponds to:
![EDS update race motivates nonces](diagrams/update-race.svg)
The management server should not send a `DiscoveryResponse` for any
`DiscoveryRequest` that has a stale nonce. A nonce becomes stale following a
newer nonce being presented to Envoy in a `DiscoveryResponse`. A management
server does not need to send an update until it determines a new version is
available. Earlier requests at a version then also become stale. It may process
multiple `DiscoveryRequests` at a version until a new version is ready.
![Requests become stale](diagrams/stale-requests.svg)
An implication of the above resource update sequencing is that Envoy does not
expect a `DiscoveryResponse` for every `DiscoveryRequest` it issues.
### Resource warming
[`Clusters`](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/cluster_manager.html#cluster-warming)
and [`Listeners`](https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/lds#config-listeners-lds)
go through `warming` before they can serve requests. This process happens both during
[`Envoy initialization`](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/init.html#initialization)
and when the `Cluster` or `Listener` is updated. Warming of `Cluster` is completed only when a
`ClusterLoadAssignment` response is supplied by management server. Similarly, warming of `Listener`
is completed only when a `RouteConfiguration` is supplied by management server if the listener
refers to an RDS configuration. Management server is expected to provide the EDS/RDS updates during
warming. If management server does not provide EDS/RDS responses, Envoy will not initialize
itself during the initialization phase and the updates sent via CDS/LDS will not take effect until
EDS/RDS responses are supplied.
#### Eventual consistency considerations
Since Envoy's xDS APIs are eventually consistent, traffic may drop briefly
during updates. For example, if only cluster __X__ is known via CDS/EDS,
a `RouteConfiguration` references cluster __X__
and is then adjusted to cluster __Y__ just before the CDS/EDS update
providing __Y__, traffic will be blackholed until __Y__ is known about by the
Envoy instance.
For some applications, a temporary drop of traffic is acceptable, retries at the
client or by other Envoy sidecars will hide this drop. For other scenarios where
drop can't be tolerated, traffic drop could have been avoided by providing a
CDS/EDS update with both __X__ and __Y__, then the RDS update repointing from
__X__ to __Y__ and then a CDS/EDS update dropping __X__.
In general, to avoid traffic drop, sequencing of updates should follow a
`make before break` model, wherein
* CDS updates (if any) must always be pushed first.
* EDS updates (if any) must arrive after CDS updates for the respective clusters.
* LDS updates must arrive after corresponding CDS/EDS updates.
* RDS updates related to the newly added listeners must arrive after CDS/EDS/LDS updates.
* VHDS updates (if any) related to the newly added RouteConfigurations must arrive after RDS updates.
* Stale CDS clusters and related EDS endpoints (ones no longer being
referenced) can then be removed.
xDS updates can be pushed independently if no new clusters/routes/listeners
are added or if it's acceptable to temporarily drop traffic during
updates. Note that in case of LDS updates, the listeners will be warmed
before they receive traffic, i.e. the dependent routes are fetched through
RDS if configured. Clusters are warmed when adding/removing/updating
clusters. On the other hand, routes are not warmed, i.e., the management
plane must ensure that clusters referenced by a route are in place, before
pushing the updates for a route.
### Aggregated Discovery Services (ADS)
It's challenging to provide the above guarantees on sequencing to avoid traffic
drop when management servers are distributed. ADS allow a single management
server, via a single gRPC stream, to deliver all API updates. This provides the
ability to carefully sequence updates to avoid traffic drop. With ADS, a single
stream is used with multiple independent `DiscoveryRequest`/`DiscoveryResponse`
sequences multiplexed via the type URL. For any given type URL, the above
sequencing of `DiscoveryRequest` and `DiscoveryResponse` messages applies. An
example update sequence might look like:
![EDS/CDS multiplexed on an ADS stream](diagrams/ads.svg)
A single ADS stream is available per Envoy instance.
An example minimal `bootstrap.yaml` fragment for ADS configuration is:
```yaml
node:
id: <node identifier>
dynamic_resources:
cds_config: {ads: {}}
lds_config: {ads: {}}
ads_config:
api_type: GRPC
grpc_services:
envoy_grpc:
cluster_name: ads_cluster
static_resources:
clusters:
- name: ads_cluster
connect_timeout: { seconds: 5 }
type: STATIC
hosts:
- socket_address:
address: <ADS management server IP address>
port_value: <ADS management server port>
lb_policy: ROUND_ROBIN
http2_protocol_options: {}
upstream_connection_options:
# configure a TCP keep-alive to detect and reconnect to the admin
# server in the event of a TCP socket disconnection
tcp_keepalive:
...
admin:
...
```
### Incremental xDS
Incremental xDS is a separate xDS endpoint that:
* Allows the protocol to communicate on the wire in terms of resource/resource
name deltas ("Delta xDS"). This supports the goal of scalability of xDS
resources. Rather than deliver all 100k clusters when a single cluster is
modified, the management server only needs to deliver the single cluster
that changed.
* Allows the Envoy to on-demand / lazily request additional resources. For
example, requesting a cluster only when a request for that cluster arrives.
An Incremental xDS session is always in the context of a gRPC bidirectional
stream. This allows the xDS server to keep track of the state of xDS clients
connected to it. There is no REST version of Incremental xDS yet.
In the delta xDS wire protocol, the nonce field is required and used to pair a
[`DeltaDiscoveryResponse`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/discovery.proto#deltadiscoveryresponse)
to a [`DeltaDiscoveryRequest`](https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/discovery.proto#deltadiscoveryrequest)
ACK or NACK.
Optionally, a response message level system_version_info is present for
debugging purposes only.
`DeltaDiscoveryRequest` can be sent in 3 situations:
1. Initial message in a xDS bidirectional gRPC stream.
2. As an ACK or NACK response to a previous `DeltaDiscoveryResponse`.
In this case the `response_nonce` is set to the nonce value in the Response.
ACK or NACK is determined by the absence or presence of `error_detail`.
3. Spontaneous `DeltaDiscoveryRequest` from the client.
This can be done to dynamically add or remove elements from the tracked
`resource_names` set. In this case `response_nonce` must be omitted.
In this first example the client connects and receives a first update that it
ACKs. The second update fails and the client NACKs the update. Later the xDS
client spontaneously requests the "wc" resource.
![Incremental session example](diagrams/incremental.svg)
On reconnect the Incremental xDS client may tell the server of its known
resources to avoid resending them over the network. Because no state is assumed
to be preserved from the previous stream, the reconnecting client must provide
the server with all resource names it is interested in.
![Incremental reconnect example](diagrams/incremental-reconnect.svg)
#### Resource names
Resources are identified by a resource name or an alias. Aliases of a resource, if present, can be
identified by the alias field in the resource of a `DeltaDiscoveryResponse`. The resource name will
be returned in the name field in the resource of a `DeltaDiscoveryResponse`.
#### Subscribing to Resources
The client can send either an alias or the name of a resource in the `resource_names_subscribe`
field of a `DeltaDiscoveryRequest` in order to subscribe to a resource. Both the names and aliases
of resources should be checked in order to determine whether the entity in question has been
subscribed to.
A `resource_names_subscribe` field may contain resource names that the server believes the client
is already subscribed to, and furthermore has the most recent versions of. However, the server
*must* still provide those resources in the response; due to implementation details hidden from
the server, the client may have "forgotten" those resources despite apparently remaining subscribed.
#### Unsubscribing from Resources
When a client loses interest in some resources, it will indicate that with the
`resource_names_unsubscribe` field of a `DeltaDiscoveryRequest`. As with `resource_names_subscribe`,
these may be resource names or aliases.
A `resource_names_unsubscribe` field may contain superfluous resource names, which the server
thought the client was already not subscribed to. The server must cleanly process such a request;
it can simply ignore these phantom unsubscriptions.
## REST-JSON polling subscriptions
Synchronous (long) polling via REST endpoints is also available for the xDS
singleton APIs. The above sequencing of messages is similar, except no
persistent stream is maintained to the management server. It is expected that
there is only a single outstanding request at any point in time, and as a result
the response nonce is optional in REST-JSON. The [JSON canonical transform of
proto3](https://developers.google.com/protocol-buffers/docs/proto3#json) is used
to encode `DiscoveryRequest` and `DiscoveryResponse` messages. ADS is not
available for REST-JSON polling.
When the poll period is set to a small value, with the intention of long
polling, then there is also a requirement to avoid sending a `DiscoveryResponse`
[unless a change to the underlying resources has
occurred](#when-to-send-an-update).

@ -0,0 +1,456 @@
xDS REST and gRPC protocol
==========================
Envoy discovers its various dynamic resources via the filesystem or by
querying one or more management servers. Collectively, these discovery
services and their corresponding APIs are referred to as *xDS*.
Resources are requested via *subscriptions*, by specifying a filesystem
path to watch, initiating gRPC streams or polling a REST-JSON URL. The
latter two methods involve sending requests with a :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>`
proto payload. Resources are delivered in a
:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`
proto payload in all methods. We discuss each type of subscription
below.
Filesystem subscriptions
------------------------
The simplest approach to delivering dynamic configuration is to place it
at a well known path specified in the :ref:`ConfigSource <envoy_api_msg_core.ConfigSource>`.
Envoy will use `inotify` (`kqueue` on macOS) to monitor the file for
changes and parse the
:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` proto in the file on update.
Binary protobufs, JSON, YAML and proto text are supported formats for
the
:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`.
There is no mechanism available for filesystem subscriptions to ACK/NACK
updates beyond stats counters and logs. The last valid configuration for
an xDS API will continue to apply if an configuration update rejection
occurs.
Streaming gRPC subscriptions
----------------------------
Singleton resource type discovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A gRPC
:ref:`ApiConfigSource <envoy_api_msg_core.ApiConfigSource>`
can be specified independently for each xDS API, pointing at an upstream
cluster corresponding to a management server. This will initiate an
independent bidirectional gRPC stream for each xDS resource type,
potentially to distinct management servers. API delivery is eventually
consistent. See :ref:`Aggregated Discovery Service` below for
situations in which explicit control of sequencing is required.
Type URLs
^^^^^^^^^
Each xDS API is concerned with resources of a given type. There is a 1:1
correspondence between an xDS API and a resource type. That is:
- LDS: :ref:`envoy.api.v2.Listener <envoy_api_msg_Listener>`
- RDS: :ref:`envoy.api.v2.RouteConfiguration <envoy_api_msg_RouteConfiguration>`
- VHDS: :ref:`envoy.api.v2.Vhds <envoy_api_msg_RouteConfiguration>`
- CDS: :ref:`envoy.api.v2.Cluster <envoy_api_msg_Cluster>`
- EDS: :ref:`envoy.api.v2.ClusterLoadAssignment <envoy_api_msg_ClusterLoadAssignment>`
- SDS: :ref:`envoy.api.v2.Auth.Secret <envoy_api_msg_Auth.Secret>`
The concept of `type URLs <https://developers.google.com/protocol-buffers/docs/proto3#any>`_ appears below, and takes the form
`type.googleapis.com/<resource type>`, e.g.
`type.googleapis.com/envoy.api.v2.Cluster` for CDS. In various
requests from Envoy and responses by the management server, the resource
type URL is stated.
ACK/NACK and versioning
^^^^^^^^^^^^^^^^^^^^^^^
Each stream begins with a
:ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` from Envoy, specifying
the list of resources to subscribe to, the type URL corresponding to the
subscribed resources, the node identifier and an empty :ref:`version_info <envoy_api_field_DiscoveryRequest.version_info>`.
An example EDS request might be:
.. code:: yaml
version_info:
node: { id: envoy }
resource_names:
- foo
- bar
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
response_nonce:
The management server may reply either immediately or when the requested
resources are available with a :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`, e.g.:
.. code:: yaml
version_info: X
resources:
- foo ClusterLoadAssignment proto encoding
- bar ClusterLoadAssignment proto encoding
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
nonce: A
After processing the :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`, Envoy will send a new
request on the stream, specifying the last version successfully applied
and the nonce provided by the management server. If the update was
successfully applied, the :ref:`version_info <envoy_api_field_DiscoveryResponse.version_info>` will be **X**, as indicated
in the sequence diagram:
.. figure:: diagrams/simple-ack.svg
:alt: Version update after ACK
In this sequence diagram, and below, the following format is used to abbreviate messages:
- *DiscoveryRequest*: (V=version_info,R=resource_names,N=response_nonce,T=type_url)
- *DiscoveryResponse*: (V=version_info,R=resources,N=nonce,T=type_url)
The version provides Envoy and the management server a shared notion of
the currently applied configuration, as well as a mechanism to ACK/NACK
configuration updates. If Envoy had instead rejected configuration
update **X**, it would reply with :ref:`error_detail <envoy_api_field_DiscoveryRequest.error_detail>`
populated and its previous version, which in this case was the empty
initial version. The :ref:`error_detail <envoy_api_field_DiscoveryRequest.error_detail>` has more details around the exact
error message populated in the message field:
.. figure:: diagrams/simple-nack.svg
:alt: No version update after NACK
Later, an API update may succeed at a new version **Y**:
.. figure:: diagrams/later-ack.svg
:alt: ACK after NACK
Each stream has its own notion of versioning, there is no shared
versioning across resource types. When ADS is not used, even each
resource of a given resource type may have a distinct version, since the
Envoy API allows distinct EDS/RDS resources to point at different :ref:`ConfigSources <envoy_api_msg_core.ConfigSource>`.
.. _Resource Updates:
When to send an update
^^^^^^^^^^^^^^^^^^^^^^
The management server should only send updates to the Envoy client when
the resources in the :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` have changed. Envoy replies
to any :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` with a :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` containing the
ACK/NACK immediately after it has been either accepted or rejected. If
the management server provides the same set of resources rather than
waiting for a change to occur, it will cause Envoy and the management
server to spin and have a severe performance impact.
Within a stream, new :ref:`DiscoveryRequests <envoy_api_msg_DiscoveryRequest>` supersede any prior
:ref:`DiscoveryRequests <envoy_api_msg_DiscoveryRequest>` having the same resource type. This means that
the management server only needs to respond to the latest
:ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` on each stream for any given resource type.
Resource hints
^^^^^^^^^^^^^^
The :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` specified in the :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` are a hint.
Some resource types, e.g. `Clusters` and `Listeners` will
specify an empty :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` list, since Envoy is interested in
learning about all the :ref:`Clusters (CDS) <envoy_api_msg_Cluster>` and :ref:`Listeners (LDS) <envoy_api_msg_Listener>`
that the management server(s) know about corresponding to its node
identification. Other resource types, e.g. :ref:`RouteConfiguration (RDS) <envoy_api_msg_RouteConfiguration>`
and :ref:`ClusterLoadAssignment (EDS) <envoy_api_msg_ClusterLoadAssignment>`, follow from earlier
CDS/LDS updates and Envoy is able to explicitly enumerate these
resources.
LDS/CDS resource hints will always be empty and it is expected that the
management server will provide the complete state of the LDS/CDS
resources in each response. An absent `Listener` or `Cluster` will
be deleted.
For EDS/RDS, the management server does not need to supply every
requested resource and may also supply additional, unrequested
resources. :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` is only a hint. Envoy will silently ignore
any superfluous resources. When a requested resource is missing in a RDS
or EDS update, Envoy will retain the last known value for this resource
except in the case where the `Cluster` or `Listener` is being
warmed. See :ref:`Resource warming` section below on
the expectations during warming. The management server may be able to
infer all the required EDS/RDS resources from the :ref:`node <envoy_api_msg_Core.Node>`
identification in the :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>`, in which case this hint may
be discarded. An empty EDS/RDS :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` is effectively a
nop from the perspective of the respective resources in the Envoy.
When a `Listener` or `Cluster` is deleted, its corresponding EDS and
RDS resources are also deleted inside the Envoy instance. In order for
EDS resources to be known or tracked by Envoy, there must exist an
applied `Cluster` definition (e.g. sourced via CDS). A similar
relationship exists between RDS and `Listeners` (e.g. sourced via
LDS).
For EDS/RDS, Envoy may either generate a distinct stream for each
resource of a given type (e.g. if each :ref:`ConfigSource <envoy_api_msg_core.ConfigSource>` has its own
distinct upstream cluster for a management server), or may combine
together multiple resource requests for a given resource type when they
are destined for the same management server. While this is left to
implementation specifics, management servers should be capable of
handling one or more :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` for a given resource type in
each request. Both sequence diagrams below are valid for fetching two
EDS resources `{foo, bar}`:
|Multiple EDS requests on the same stream| |Multiple EDS requests on
distinct streams|
Resource updates
^^^^^^^^^^^^^^^^
As discussed above, Envoy may update the list of :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` it
presents to the management server in each :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` that
ACK/NACKs a specific :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`. In addition, Envoy may later
issue additional :ref:`DiscoveryRequests <envoy_api_msg_DiscoveryRequest>` at a given :ref:`version_info <envoy_api_field_DiscoveryRequest.version_info>` to
update the management server with new resource hints. For example, if
Envoy is at EDS version **X** and knows only about cluster ``foo``, but
then receives a CDS update and learns about ``bar`` in addition, it may
issue an additional :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` for **X** with `{foo,bar}` as
`resource_names`.
.. figure:: diagrams/cds-eds-resources.svg
:alt: CDS response leads to EDS resource hint update
There is a race condition that may arise here; if after a resource hint
update is issued by Envoy at **X**, but before the management server
processes the update it replies with a new version **Y**, the resource
hint update may be interpreted as a rejection of **Y** by presenting an
**X** :ref:`version_info <envoy_api_field_DiscoveryResponse.version_info>`. To avoid this, the management server provides a
``nonce`` that Envoy uses to indicate the specific :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`
each :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` corresponds to:
.. figure:: diagrams/update-race.svg
:alt: EDS update race motivates nonces
The management server should not send a :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` for any
:ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` that has a stale nonce. A nonce becomes stale
following a newer nonce being presented to Envoy in a
:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`. A management server does not need to send an
update until it determines a new version is available. Earlier requests
at a version then also become stale. It may process multiple
:ref:`DiscoveryRequests <envoy_api_msg_DiscoveryRequest>` at a version until a new version is ready.
.. figure:: diagrams/stale-requests.svg
:alt: Requests become stale
An implication of the above resource update sequencing is that Envoy
does not expect a :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` for every :ref:`DiscoveryRequests <envoy_api_msg_DiscoveryRequest>`
it issues.
.. _Resource Warming:
Resource warming
~~~~~~~~~~~~~~~~
:ref:`Clusters <arch_overview_cluster_warming>` and
:ref:`Listeners <config_listeners_lds>`
go through warming before they can serve requests. This process
happens both during :ref:`Envoy initialization <arch_overview_initialization>`
and when the `Cluster` or `Listener` is updated. Warming of
`Cluster` is completed only when a `ClusterLoadAssignment` response
is supplied by management server. Similarly, warming of `Listener` is
completed only when a `RouteConfiguration` is supplied by management
server if the listener refers to an RDS configuration. Management server
is expected to provide the EDS/RDS updates during warming. If management
server does not provide EDS/RDS responses, Envoy will not initialize
itself during the initialization phase and the updates sent via CDS/LDS
will not take effect until EDS/RDS responses are supplied.
Eventual consistency considerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Since Envoy's xDS APIs are eventually consistent, traffic may drop
briefly during updates. For example, if only cluster **X** is known via
CDS/EDS, a `RouteConfiguration` references cluster **X** and is then
adjusted to cluster **Y** just before the CDS/EDS update providing
**Y**, traffic will be blackholed until **Y** is known about by the
Envoy instance.
For some applications, a temporary drop of traffic is acceptable,
retries at the client or by other Envoy sidecars will hide this drop.
For other scenarios where drop can't be tolerated, traffic drop could
have been avoided by providing a CDS/EDS update with both **X** and
**Y**, then the RDS update repointing from **X** to **Y** and then a
CDS/EDS update dropping **X**.
In general, to avoid traffic drop, sequencing of updates should follow a
make before break model, wherein:
- CDS updates (if any) must always be pushed first.
- EDS updates (if any) must arrive after CDS updates for the respective clusters.
- LDS updates must arrive after corresponding CDS/EDS updates.
- RDS updates related to the newly added listeners must arrive after CDS/EDS/LDS updates.
- VHDS updates (if any) related to the newly added RouteConfigurations must arrive after RDS updates.
- Stale CDS clusters and related EDS endpoints (ones no longer being referenced) can then be removed.
xDS updates can be pushed independently if no new
clusters/routes/listeners are added or if it's acceptable to temporarily
drop traffic during updates. Note that in case of LDS updates, the
listeners will be warmed before they receive traffic, i.e. the dependent
routes are fetched through RDS if configured. Clusters are warmed when
adding/removing/updating clusters. On the other hand, routes are not
warmed, i.e., the management plane must ensure that clusters referenced
by a route are in place, before pushing the updates for a route.
.. _Aggregated Discovery Service:
Aggregated Discovery Service (ADS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It's challenging to provide the above guarantees on sequencing to avoid
traffic drop when management servers are distributed. ADS allow a single
management server, via a single gRPC stream, to deliver all API updates.
This provides the ability to carefully sequence updates to avoid traffic
drop. With ADS, a single stream is used with multiple independent
:ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>`/:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` sequences multiplexed via the
type URL. For any given type URL, the above sequencing of
:ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` and :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` messages applies. An
example update sequence might look like:
.. figure:: diagrams/ads.svg
:alt: EDS/CDS multiplexed on an ADS stream
A single ADS stream is available per Envoy instance.
An example minimal ``bootstrap.yaml`` fragment for ADS configuration is:
.. code:: yaml
node:
id: <node identifier>
dynamic_resources:
cds_config: {ads: {}}
lds_config: {ads: {}}
ads_config:
api_type: GRPC
grpc_services:
envoy_grpc:
cluster_name: ads_cluster
static_resources:
clusters:
- name: ads_cluster
connect_timeout: { seconds: 5 }
type: STATIC
hosts:
- socket_address:
address: <ADS management server IP address>
port_value: <ADS management server port>
lb_policy: ROUND_ROBIN
http2_protocol_options: {}
upstream_connection_options:
# configure a TCP keep-alive to detect and reconnect to the admin
# server in the event of a TCP socket disconnection
tcp_keepalive:
...
admin:
...
Incremental xDS
~~~~~~~~~~~~~~~
Incremental xDS is a separate xDS endpoint that:
- Allows the protocol to communicate on the wire in terms of
resource/resource name deltas ("Delta xDS"). This supports the goal
of scalability of xDS resources. Rather than deliver all 100k
clusters when a single cluster is modified, the management server
only needs to deliver the single cluster that changed.
- Allows the Envoy to on-demand / lazily request additional resources.
For example, requesting a cluster only when a request for that
cluster arrives.
An Incremental xDS session is always in the context of a gRPC
bidirectional stream. This allows the xDS server to keep track of the
state of xDS clients connected to it. There is no REST version of
Incremental xDS yet.
In the delta xDS wire protocol, the nonce field is required and used to
pair a :ref:`DeltaDiscoveryResponse <envoy_api_msg_DeltaDiscoveryResponse>`
to a :ref:`DeltaDiscoveryRequest <envoy_api_msg_DeltaDiscoveryRequest>`
ACK or NACK. Optionally, a response message level :ref:`system_version_info <envoy_api_field_DeltaDiscoveryResponse.system_version_info>`
is present for debugging purposes only.
:ref:`DeltaDiscoveryRequest <envoy_api_msg_DeltaDiscoveryRequest>` can be sent in the following situations:
- Initial message in a xDS bidirectional gRPC stream.
- As an ACK or NACK response to a previous :ref:`DeltaDiscoveryResponse <envoy_api_msg_DeltaDiscoveryResponse>`. In this case the :ref:`response_nonce <envoy_api_field_DiscoveryRequest.response_nonce>` is set to the nonce value in the Response. ACK or NACK is determined by the absence or presence of :ref:`error_detail <envoy_api_field_DiscoveryRequest.error_detail>`.
- Spontaneous :ref:`DeltaDiscoveryRequests <envoy_api_msg_DeltaDiscoveryRequest>` from the client. This can be done to dynamically add or remove elements from the tracked :ref:`resource_names <envoy_api_field_DiscoveryRequest.resource_names>` set. In this case :ref:`response_nonce <envoy_api_field_DiscoveryRequest.response_nonce>` must be omitted.
In this first example the client connects and receives a first update
that it ACKs. The second update fails and the client NACKs the update.
Later the xDS client spontaneously requests the "wc" resource.
.. figure:: diagrams/incremental.svg
:alt: Incremental session example
On reconnect the Incremental xDS client may tell the server of its known
resources to avoid resending them over the network. Because no state is
assumed to be preserved from the previous stream, the reconnecting
client must provide the server with all resource names it is interested
in.
.. figure:: diagrams/incremental-reconnect.svg
:alt: Incremental reconnect example
Resource names
^^^^^^^^^^^^^^
Resources are identified by a resource name or an alias. Aliases of a
resource, if present, can be identified by the alias field in the
resource of a :ref:`DeltaDiscoveryResponse <envoy_api_msg_DeltaDiscoveryResponse>`. The resource name will be
returned in the name field in the resource of a
:ref:`DeltaDiscoveryResponse <envoy_api_msg_DeltaDiscoveryResponse>`.
Subscribing to Resources
^^^^^^^^^^^^^^^^^^^^^^^^
The client can send either an alias or the name of a resource in the
:ref:`resource_names_subscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_subscribe>` field of a :ref:`DeltaDiscoveryRequest <envoy_api_msg_DeltaDiscoveryRequest>` in
order to subscribe to a resource. Both the names and aliases of
resources should be checked in order to determine whether the entity in
question has been subscribed to.
A :ref:`resource_names_subscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_subscribe>` field may contain resource names that the
server believes the client is already subscribed to, and furthermore has
the most recent versions of. However, the server *must* still provide
those resources in the response; due to implementation details hidden
from the server, the client may have "forgotten" those resources despite
apparently remaining subscribed.
Unsubscribing from Resources
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When a client loses interest in some resources, it will indicate that
with the :ref:`resource_names_unsubscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_unsubscribe>` field of a
:ref:`DeltaDiscoveryRequest <envoy_api_msg_DeltaDiscoveryRequest>`. As with :ref:`resource_names_subscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_subscribe>`, these
may be resource names or aliases.
A :ref:`resource_names_unsubscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_unsubscribe>` field may contain superfluous resource
names, which the server thought the client was already not subscribed
to. The server must cleanly process such a request; it can simply ignore
these phantom unsubscriptions.
REST-JSON polling subscriptions
-------------------------------
Synchronous (long) polling via REST endpoints is also available for the
xDS singleton APIs. The above sequencing of messages is similar, except
no persistent stream is maintained to the management server. It is
expected that there is only a single outstanding request at any point in
time, and as a result the response nonce is optional in REST-JSON. The
`JSON canonical transform of
proto3 <https://developers.google.com/protocol-buffers/docs/proto3#json>`__
is used to encode :ref:`DiscoveryRequest <envoy_api_msg_DiscoveryRequest>` and :ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>`
messages. ADS is not available for REST-JSON polling.
When the poll period is set to a small value, with the intention of long
polling, then there is also a requirement to avoid sending a
:ref:`DiscoveryResponse <envoy_api_msg_DiscoveryResponse>` :ref:`unless a change to the underlying resources has
occurred <Resource Updates>`.
.. |Multiple EDS requests on the same stream| image:: diagrams/eds-same-stream.svg
.. |Multiple EDS requests on distinct streams| image:: diagrams/eds-distinct-stream.svg
Loading…
Cancel
Save