|
|
|
Load Balancing in gRPC
|
|
|
|
======================
|
|
|
|
|
|
|
|
# Scope
|
|
|
|
|
|
|
|
This document explains the design for load balancing within gRPC.
|
|
|
|
|
|
|
|
# Background
|
|
|
|
|
|
|
|
Load-balancing within gRPC happens on a per-call basis, not a
|
|
|
|
per-connection basis. In other words, even if all requests come from a
|
|
|
|
single client, we still want them to be load-balanced across all servers.
|
|
|
|
|
|
|
|
# Architecture
|
|
|
|
|
|
|
|
## Overview
|
|
|
|
|
|
|
|
The gRPC client supports an API that allows load balancing policies to
|
|
|
|
be implemented and plugged into gRPC. An LB policy is responsible for:
|
|
|
|
- receiving updated configuration and list of server addresses from the
|
|
|
|
resolver
|
|
|
|
- creating subchannels for the server addresses and managing their
|
|
|
|
connectivity behavior
|
|
|
|
- setting the overall [connectivity state](connectivity-semantics-and-api.md)
|
|
|
|
(usually computed by aggregating the connectivity states of its subchannels)
|
|
|
|
of the channel
|
|
|
|
- for each RPC sent on the channel, determining which subchannel to send
|
|
|
|
the RPC on
|
|
|
|
|
|
|
|
There are a number of LB policies provided with gRPC. The most
|
|
|
|
notable ones are `pick_first` (the default), `round_robin`, and
|
|
|
|
`grpclb`. There are also a number of additional LB policies to support
|
|
|
|
[xDS](grpc_xds_features.md), although they are not currently configurable
|
|
|
|
directly.
|
|
|
|
|
|
|
|
## Workflow
|
|
|
|
|
|
|
|
Load-balancing policies fit into the gRPC client workflow in between
|
|
|
|
name resolution and the connection to the server. Here's how it all
|
|
|
|
works:
|
|
|
|
|
|
|
|
![image](images/load-balancing.png)
|
|
|
|
|
|
|
|
1. On startup, the gRPC client issues a [name resolution](naming.md) request
|
|
|
|
for the server name. The name will resolve to a list of IP addresses,
|
|
|
|
a [service config](service_config.md) that indicates which client-side
|
|
|
|
load-balancing policy to use (e.g., `round_robin` or `grpclb`) and
|
|
|
|
provides a configuration for that policy, and a set of attributes
|
|
|
|
(channel args in C-core).
|
|
|
|
2. The client instantiates the load balancing policy and passes it its
|
|
|
|
configuration from the service config, the list of IP addresses, and
|
|
|
|
the attributes.
|
|
|
|
3. The load balancing policy creates a set of subchannels for the IP
|
|
|
|
addresses of the servers (which might be different from the IP
|
|
|
|
addresses returned by the resolver; see below). It also watches the
|
|
|
|
subchannels' connectivity states and decides when each subchannel
|
|
|
|
should attempt to connect.
|
|
|
|
4. For each RPC sent, the load balancing policy decides which
|
|
|
|
subchannel (i.e., which server) the RPC should be sent to.
|
|
|
|
|
|
|
|
See below for more information on `grpclb`.
|
|
|
|
|
|
|
|
## Load Balancing Policies
|
|
|
|
|
|
|
|
### `pick_first`
|
|
|
|
|
|
|
|
This is the default LB policy if the service config does not specify any
|
|
|
|
LB policy. It does not require any configuration.
|
|
|
|
|
|
|
|
The `pick_first` policy takes a list of addresses from the resolver. It
|
|
|
|
attempts to connect to those addresses one at a time, in order, until it
|
|
|
|
finds one that is reachable. If none of the addresses are reachable, it
|
|
|
|
sets the channel's state to TRANSIENT_FAILURE while it attempts to
|
|
|
|
reconnect. Appropriate [backoff](connection-backoff.md) is applied for
|
|
|
|
repeated connection attempts.
|
|
|
|
|
|
|
|
If it is able to connect to one of the addresses, it sets the channel's
|
|
|
|
state to READY, and then all RPCs sent on the channel will be sent to
|
|
|
|
that address. If the connection to that address is later broken,
|
|
|
|
the `pick_first` policy will put the channel into state IDLE, and it
|
|
|
|
will not attempt to reconnect until the application requests that it
|
|
|
|
does so (either via the channel's connectivity state API or by sending
|
|
|
|
an RPC).
|
|
|
|
|
|
|
|
### `round_robin`
|
|
|
|
|
|
|
|
This LB policy is selected via the service config. It does not require
|
|
|
|
any configuration.
|
|
|
|
|
|
|
|
This policy takes a list of addresses from the resolver. It creates a
|
|
|
|
subchannel for each of those addresses and constantly monitors the
|
|
|
|
connectivity state of the subchannels. Whenever a subchannel becomes
|
|
|
|
disconnected, the `round_robin` policy will ask it to reconnect, with
|
|
|
|
appropriate connection [backoff](connection-backoff.md).
|
|
|
|
|
|
|
|
The policy sets the channel's connectivity state by aggregating the
|
|
|
|
states of the subchannels:
|
|
|
|
- If any one subchannel is in READY state, the channel's state is READY.
|
|
|
|
- Otherwise, if there is any subchannel in state CONNECTING, the channel's
|
|
|
|
state is CONNECTING.
|
|
|
|
- Otherwise, if there is any subchannel in state IDLE, the channel's state is
|
|
|
|
IDLE.
|
|
|
|
- Otherwise, if all subchannels are in state TRANSIENT_FAILURE, the channel's
|
|
|
|
state is TRANSIENT_FAILURE.
|
|
|
|
|
|
|
|
Note that when a given subchannel reports TRANSIENT_FAILURE, it is
|
|
|
|
considered to still be in TRANSIENT_FAILURE until it successfully
|
|
|
|
reconnects and reports READY. In particular, we ignore the transition
|
|
|
|
from TRANSIENT_FAILURE to CONNECTING.
|
|
|
|
|
|
|
|
When an RPC is sent on the channel, the `round_robin` policy will
|
|
|
|
iterate over all subchannels that are currently in READY state, sending
|
|
|
|
each successive RPC to the next successive subchannel in the list,
|
|
|
|
wrapping around to the start of the list when needed.
|
|
|
|
|
|
|
|
### `grpclb`
|
|
|
|
|
|
|
|
(This policy is deprecated. We recommend using [xDS](grpc_xds_features.md)
|
|
|
|
instead.)
|
|
|
|
|
|
|
|
This LB policy was originally intended as gRPC's primary extensibility
|
|
|
|
mechanism for load balancing. The intent was that instead of adding new
|
|
|
|
LB policies directly in the client, the client could implement only
|
|
|
|
simple algorithms like `round_robin`, and any more complex algorithms
|
|
|
|
would be provided by a look-aside load balancer.
|
|
|
|
|
|
|
|
The client relies on the load balancer to provide _load balancing
|
|
|
|
configuration_ and _the list of server addresses_ to which the client should
|
|
|
|
send requests. The balancer updates the server list as needed to balance
|
|
|
|
the load as well as handle server unavailability or health issues. The
|
|
|
|
load balancer will make any necessary complex decisions and inform the
|
|
|
|
client. The load balancer may communicate with the backend servers to
|
|
|
|
collect load and health information.
|
|
|
|
|
|
|
|
The `grpclb` policy uses the addresses returned by the resolver (if any)
|
|
|
|
as fallback addresses, which are used when it loses contact with the
|
|
|
|
balancers.
|
|
|
|
|
|
|
|
The `grpclb` policy gets the list of addresses of the balancers to talk to
|
|
|
|
via an attribute returned by the resolver.
|