mirror of https://github.com/c-ares/c-ares.git
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
237 lines
10 KiB
237 lines
10 KiB
# Features |
|
|
|
- [Dynamic Server Timeout Calculation](#dynamic-server-timeout-calculation) |
|
- [Failed Server Isolation](#failed-server-isolation) |
|
- [Query Cache](#query-cache) |
|
- [DNS 0x20 Query Name Case Randomization](#dns-0x20-query-name-case-randomization) |
|
- [DNS Cookies](#dns-cookies) |
|
- [TCP FastOpen (0-RTT)](#tcp-fastopen-0-rtt) |
|
- [Event Thread](#event-thread) |
|
- [System Configuration Change Monitoring](#system-configuration-change-monitoring) |
|
|
|
|
|
## Dynamic Server Timeout Calculation |
|
|
|
Metrics are stored for every server in time series buckets for both the current |
|
time span and prior time span in 1 minute, 15 minute, 1 hour, and 1 day |
|
intervals, plus a single since-inception bucket (of the server in the c-ares |
|
channel). |
|
|
|
These metrics are then used to calculate the average latency for queries on |
|
each server, which automatically adjusts to network conditions. This average |
|
is then multiplied by 5 to come up with a timeout to use for the query before |
|
re-queuing it. If there is not sufficient data yet to calculate a timeout |
|
(need at least 3 prior queries), then the default of 2000ms is used (or an |
|
administrator-set `ARES_OPT_TIMEOUTMS`). |
|
|
|
The timeout is then adjusted to a minimum bound of 250ms which is the |
|
approximate RTT of network traffic half-way around the world, to account for the |
|
upstream server needing to recurse to a DNS server far away. It is also |
|
bounded on the upper end to 5000ms (or an administrator-set |
|
`ARES_OPT_MAXTIMEOUTMS`). |
|
|
|
If a server does not reply within the given calculated timeout, the next time |
|
the query is re-queued to the same server, the timeout will approximately |
|
double thus leading to adjustments in timeouts automatically when a successful |
|
reply is recorded. |
|
|
|
In order to calculate the optimal timeout, it is highly recommended to ensure |
|
`ARES_OPT_QUERY_CACHE` is enabled with a non-zero `qcache_max_ttl` (which it |
|
is enabled by default with a 3600s default max ttl). The goal is to record |
|
the recursion time as part of query latency as the upstream server will also |
|
cache results. |
|
|
|
This feature requires the c-ares channel to persist for the lifetime of the |
|
application. |
|
|
|
|
|
## Failed Server Isolation |
|
|
|
Each server is tracked for failures relating to consecutive connectivity issues |
|
or unrecoverable response codes. Servers are sorted in priority order based |
|
on this metric. Downed servers will be brought back online either when the |
|
current highest priority has failed, or has been determined to be online when |
|
a query is randomly selected to probe a downed server. |
|
|
|
By default a downed server won't be retried for 5 seconds, and queries will |
|
have a 10% chance of being chosen after this timeframe to test a downed server. |
|
Administrators may customize these settings via `ARES_OPT_SERVER_FAILOVER`. |
|
|
|
In the future we may use independent queries to probe downed servers to not |
|
impact latency of any queries when a server is known to be down. |
|
|
|
`ARES_OPT_ROTATE` or a system configuration option of `rotate` will disable |
|
this feature as servers will be chosen at random. In the future we may |
|
enhance this capability to only randomly choose online servers. |
|
|
|
This feature requires the c-ares channel to persist for the lifetime of the |
|
application. |
|
|
|
|
|
## Query Cache |
|
|
|
Every successful query response, as well as `NXDOMAIN` responses containing |
|
an `SOA` record are cached using the `TTL` returned or the SOA Minimum as |
|
appropriate. This timeout is bounded by the `ARES_OPT_QUERY_CACHE` |
|
`qcache_max_ttl`, which defaults to 1hr. |
|
|
|
The query is cached at the lowest possible layer, meaning a call into |
|
`ares_search_dnsrec()` or `ares_getaddrinfo()` may spawn multiple queries |
|
in order to complete its lookup, each individual backend query result will |
|
be cached. |
|
|
|
Any server list change will automatically invalidate the cache in order to |
|
purge any possible stale data. For example, if `NXDOMAIN` is cached but system |
|
configuration has changed due to a VPN connection, the same query might now |
|
result in a valid response. |
|
|
|
This feature is not expected to cause any issues that wouldn't already be |
|
present due to the upstream DNS server having substantially similar caching |
|
already. However if desired it can be disabled by setting `qcache_max_ttl` to |
|
`0`. |
|
|
|
This feature requires the c-ares channel to persist for the lifetime of the |
|
application. |
|
|
|
|
|
## DNS 0x20 Query Name Case Randomization |
|
|
|
DNS 0x20 is the name of the feature which automatically randomizes the case |
|
of the characters in a UDP query as defined in |
|
[draft-vixie-dnsext-dns0x20-00](https://datatracker.ietf.org/doc/html/draft-vixie-dnsext-dns0x20-00). |
|
|
|
For example, if name resolution is performed for `www.example.com`, the actual |
|
query sent to the upstream name server may be `Www.eXaMPlE.cOM`. |
|
|
|
The reason to randomize case characters is to provide additional entropy in the |
|
query to be able to detect off-path cache poisoning attacks for UDP. This is |
|
not used for TCP connections which are not known to be vulnerable to such |
|
attacks due to their stateful nature. |
|
|
|
Much research has been performed by |
|
[Google](https://groups.google.com/g/public-dns-discuss/c/KxIDPOydA5M) |
|
on case randomization and in general have found it to be effective and widely |
|
supported. |
|
|
|
This feature is disabled by default and can be enabled via `ARES_FLAG_DNS0x20`. |
|
There are some instances where servers do not properly facilitate this feature |
|
and unlike in a recursive resolver where it may be possible to determine an |
|
authoritative server is incapable, its much harder to come to any reliable |
|
conclusion as a stub resolver where the issue resides. Due to the recent wide |
|
deployment of DNS 0x20 in large public DNS servers, it is expected |
|
compatibility will improve rapidly where this feature, in time, may be able |
|
to be enabled by default. |
|
|
|
Another feature which can be used to prevent off-path cache poisoning attacks |
|
is [DNS Cookies](#dns-cookies). |
|
|
|
|
|
## DNS Cookies |
|
|
|
DNS Cookies are are a method of learned mutual authentication between a server |
|
and a client as defined in |
|
[RFC7873](https://datatracker.ietf.org/doc/html/rfc7873), |
|
and [RFC9018](https://datatracker.ietf.org/doc/html/rfc9018). |
|
|
|
This mutual authentication ensures clients are protected from off-path cache |
|
poisoning attacks, and protects servers from being used as DNS amplification |
|
attack sources. Many servers will disable query throttling limits when DNS |
|
Cookies are in use. It only applies to UDP connections. |
|
|
|
Since DNS Cookies are optional and learned dynamically, this is an always-on |
|
feature and will automatically adjust based on the upstream server state. The |
|
only potential issue is if a server has once supported DNS Cookies then stops |
|
supporting them, it must clear a regression timeout of 2 minutes before it can |
|
accept responses without cookies. Such a scenario would be exceedingly rare. |
|
|
|
Interestingly, the large public recursive DNS servers such as provided by |
|
[Google](https://developers.google.com/speed/public-dns/docs/using), |
|
[CloudFlare](https://one.one.one.one/), and |
|
[OpenDNS](https://opendns.com) do not have this feature enabled. That said, |
|
most DNS products like [BIND](https://www.isc.org/bind/) enable DNS Cookies |
|
by default. |
|
|
|
This feature requires the c-ares channel to persist for the lifetime of the |
|
application. |
|
|
|
|
|
## TCP FastOpen (0-RTT) |
|
|
|
TCP Fast Open is defined in [RFC7413](https://datatracker.ietf.org/doc/html/rfc7413) |
|
and enables data to be sent with the TCP SYN packet when establishing the |
|
connection, thus rivaling the performance of UDP. A previous connection must |
|
have already have been established in order to obtain the client cookie to |
|
allow the server to trust the data sent in the first packet and know it was not |
|
an off-path attack. |
|
|
|
TCP FastOpen can only be used with idempotent requests since in timeout |
|
conditions the SYN packet with data may be re-sent which may cause the server |
|
to process the packet more than once. Luckily DNS requests are idempotent by |
|
nature. |
|
|
|
TCP FastOpen is supported on Linux, MacOS, and FreeBSD. Most other systems do |
|
not support this feature, or like on Windows require use of completion |
|
notifications to use it whereas c-ares relies on readiness notifications. |
|
|
|
Supported systems also need to be configured appropriately on both the client |
|
and server systems. |
|
|
|
### Linux TFO |
|
sysctl `net.ipv4.tcp_fastopen`: |
|
- `1` = client only (typically default) |
|
- `2` = server only |
|
- `3` = client and server |
|
|
|
### MacOS TFO |
|
sysctl `net.inet.tcp.fastopen` |
|
- `1` = client only |
|
- `2` = server only |
|
- `3` = client and server (typically default) |
|
|
|
### FreeBSD TFO |
|
sysctl `net.inet.tcp.fastopen.server_enable` (boolean) and |
|
`net.inet.tcp.fastopen.client_enable` (boolean). |
|
|
|
|
|
## Event Thread |
|
|
|
Historic c-ares integrations required integrators to have their own event loop |
|
which would be required to notify c-ares of read and write events for each |
|
socket. It was also required to notify c-ares at the appropriate timeout if |
|
no events had occurred. This could be difficult to do correctly and could |
|
lead to stalls or other issues. |
|
|
|
The Event Thread is currently supported on all systems except DOS which does |
|
not natively support threading (however it could in theory be possible to |
|
enable with something like [FSUpthreads](https://arcb.csc.ncsu.edu/~mueller/pthreads/)). |
|
|
|
c-ares is built by default with threading support enabled, however it may |
|
disabled at compile time. The event thread must also be specifically enabled |
|
via `ARES_OPT_EVENT_THREAD`. |
|
|
|
Using the Event Thread feature also facilitates some other features like |
|
[System Configuration Change Monitoring](#system-configuration-change-monitoring), |
|
and automatically enables the `ares_set_pending_write_cb()` feature to optimize |
|
multi-query writing. |
|
|
|
|
|
## System Configuration Change Monitoring |
|
|
|
The system configuration is automatically monitored for changes to the network |
|
and DNS settings. When a change is detected a thread is spawned to read the |
|
new configuration then apply it to the current c-ares configuration. |
|
|
|
This feature requires the [Event Thread](#event-thread) to be enabled via |
|
`ARES_OPT_EVENT_THREAD`. Otherwise it is up to the integrator to do their own |
|
configuration monitoring and call `ares_reinit()` to reload the system |
|
configuration. |
|
|
|
It is supported on Windows, MacOS, iOS and any system configuration that uses |
|
`/etc/resolv.conf` and similar files such as Linux and FreeBSD. Specifically |
|
excluded are DOS and Android due to missing mechanisms to support such a |
|
feature. On linux file monitoring will result in immediate change detection, |
|
however on other unix-like systems a polling mechanism is used that checks every |
|
30s for changes. |
|
|
|
This feature requires the c-ares channel to persist for the lifetime of the |
|
application.
|
|
|