diff --git a/.reuse/dep5 b/.reuse/dep5 index ebdc4dc4..f0a51377 100644 --- a/.reuse/dep5 +++ b/.reuse/dep5 @@ -9,7 +9,7 @@ Copyright: The c-ares project and its contributors. License: MIT # Docs -Files: AUTHORS CONTRIBUTING.md GIT-INFO README.md README.msvc RELEASE-PROCEDURE.md RELEASE-NOTES.md SECURITY.md DEVELOPER-NOTES.md INSTALL.md test/README.md test/FUZZING.md src/lib/include/README.md src/lib/thirdparty/apple/README.md +Files: AUTHORS CONTRIBUTING.md GIT-INFO README.md README.msvc RELEASE-PROCEDURE.md RELEASE-NOTES.md FEATURES.md SECURITY.md DEVELOPER-NOTES.md INSTALL.md test/README.md test/FUZZING.md src/lib/include/README.md src/lib/thirdparty/apple/README.md Copyright: The c-ares project and its contributors. License: MIT diff --git a/FEATURES.md b/FEATURES.md new file mode 100644 index 00000000..4532d1f2 --- /dev/null +++ b/FEATURES.md @@ -0,0 +1,238 @@ +# Features + +Information about a few features in c-ares which can provide insight into +behavior and security of the system, and what tunables may be used to tweak +operation. + +- [Dynamic Server Timeout Calculation](#dynamic-server-timeout-calculation) +- [Failed Server Isolation](#failed-server-isolation) +- [Query Cache](#query-cache) +- [DNS 0x20 Query Name Case Randomization](#dns-0x20-query-name-case-randomization) +- [DNS Cookies](#dns-cookies) +- [TCP FastOpen (0-RTT)](#tcp-fastopen-0-rtt) +- [Event Thread](#event-thread) +- [System Configuration Change Monitoring](#system-configuration-change-monitoring) + + +## Dynamic Server Timeout Calculation + +Metrics are stored for every server in time series buckets for both the current +time span and prior time span in 1 minute, 15 minute, 1 hour, and 1 day +intervals, plus a single since-inception bucket (of the server in the c-ares +channel). + +These metrics are then used to calculate the average latency for queries on +each server, which automatically adjusts to network conditions.
This average +is then multiplied by 5 to come up with a timeout to use for the query before +re-queuing it. If there is not sufficient data yet to calculate a timeout +(need at least 3 prior queries), then the default of 2000ms is used (or an +administrator-set `ARES_OPT_TIMEOUTMS`). + +The timeout is then adjusted to a minimum bound of 250ms which is the +approximate RTT of network traffic half-way around the world, to account for the +upstream server needing to recurse to a DNS server far away. It is also +bounded on the upper end to 5000ms (or an administrator-set +`ARES_OPT_MAXTIMEOUTMS`). + +If a server does not reply within the given calculated timeout, the next time +the query is re-queued to the same server, the timeout will approximately +double thus leading to adjustments in timeouts automatically when a successful +reply is recorded. + +In order to calculate the optimal timeout, it is highly recommended to ensure +`ARES_OPT_QUERY_CACHE` is enabled with a non-zero `qcache_max_ttl` (it is +enabled by default with a max TTL of 3600s). The goal is to record +the recursion time as part of query latency as the upstream server will also +cache results. + +This feature requires the c-ares channel to persist for the lifetime of the +application. + + +## Failed Server Isolation + +Each server is tracked for failures relating to consecutive connectivity issues +or unrecoverable response codes. Servers are sorted in priority order based +on this metric. A downed server will be brought back online either when the +current highest priority server has failed, or when it has been determined to +be online by a query randomly selected to probe it. + +By default a downed server won't be retried for 5 seconds, and queries will +have a 10% chance of being chosen after this timeframe to test a downed server. +Administrators may customize these settings via `ARES_OPT_SERVER_FAILOVER`.
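As a rough illustration of the defaults described above, the probe decision can be sketched as follows. This is not c-ares source code: the `failover_cfg_t` type, its field names, and `should_probe()` are hypothetical, and the probe chance is expressed as a percentage purely for readability. The caller is assumed to supply a random roll in the range 0 to 99.

```c
#include <stdbool.h>

/* Hypothetical settings mirroring the documented defaults (not the c-ares API). */
typedef struct {
  unsigned long retry_delay_ms;   /* how long a downed server is left alone */
  unsigned int  probe_chance_pct; /* share of queries used as probes afterwards */
} failover_cfg_t;

/* Decide whether the current query should be used to probe a downed server.
 * ms_since_failure: time elapsed since the server was marked as failed.
 * roll:             caller-supplied random value in [0, 99]. */
static bool should_probe(const failover_cfg_t *cfg,
                         unsigned long ms_since_failure,
                         unsigned int roll)
{
  /* Inside the retry delay the downed server is never tried. */
  if (ms_since_failure < cfg->retry_delay_ms) {
    return false;
  }
  /* After the delay, only a fraction of queries is selected as a probe. */
  return roll < cfg->probe_chance_pct;
}
```

With `retry_delay_ms = 5000` and `probe_chance_pct = 10`, a query arriving 4999ms after the failure is never used as a probe, while one arriving at 5000ms or later is selected only when the roll is below 10.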
+ +In the future we may use independent queries to probe downed servers so as not +to impact the latency of any queries when a server is known to be down. + +`ARES_OPT_ROTATE` or a system configuration option of `rotate` will disable +this feature as servers will be chosen at random. In the future we may +enhance this capability to only randomly choose online servers. + +This feature requires the c-ares channel to persist for the lifetime of the +application. + + +## Query Cache + +Every successful query response, as well as `NXDOMAIN` responses containing +an `SOA` record, is cached using the `TTL` returned or the SOA Minimum as +appropriate. This timeout is bounded by the `ARES_OPT_QUERY_CACHE` +`qcache_max_ttl`, which defaults to 1hr. + +The query is cached at the lowest possible layer, meaning that when a call into +`ares_search_dnsrec()` or `ares_getaddrinfo()` spawns multiple queries +in order to complete its lookup, each individual backend query result will +be cached. + +Any server list change will automatically invalidate the cache in order to +purge any possible stale data. For example, if `NXDOMAIN` is cached but system +configuration has changed due to a VPN connection, the same query might now +result in a valid response. + +This feature is not expected to cause any issues that wouldn't already be +present due to the upstream DNS server having substantially similar caching +already. However, if desired, it can be disabled by setting `qcache_max_ttl` to +`0`. + +This feature requires the c-ares channel to persist for the lifetime of the +application. + + +## DNS 0x20 Query Name Case Randomization + +DNS 0x20 is the name of the feature which automatically randomizes the case +of the characters in a UDP query as defined in +[draft-vixie-dnsext-dns0x20-00](https://datatracker.ietf.org/doc/html/draft-vixie-dnsext-dns0x20-00). + +For example, if name resolution is performed for `www.example.com`, the actual +query sent to the upstream name server may be `Www.eXaMPlE.cOM`.
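The case mixing in the example above can be sketched in a few lines of C. This is a minimal illustration and not the c-ares implementation: `dns0x20_mix_case()`, `dns0x20_reply_matches()`, and the `rand_bit_fn` callback are hypothetical names, and `alternating_bit()` is a deterministic stand-in used only for demonstration. A real resolver would draw bits from a strong random source. The check relies on the server echoing the question name with its case preserved, as the draft describes.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical callback returning a value whose lowest bit is used. */
typedef unsigned int (*rand_bit_fn)(void *ctx);

/* Set the case of every ASCII letter in name from one bit per letter.
 * Bit 0x20 is the only difference between upper and lower case in ASCII,
 * which is where the feature gets its name. */
static void dns0x20_mix_case(char *name, rand_bit_fn rand_bit, void *ctx)
{
  for (char *p = name; *p != '\0'; p++) {
    unsigned char c = (unsigned char)*p;
    bool is_letter  = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    if (!is_letter) {
      continue; /* digits, dots and hyphens carry no case */
    }
    if (rand_bit(ctx) & 1u) {
      c = (unsigned char)(c & ~0x20u); /* clear bit 0x20: upper case */
    } else {
      c = (unsigned char)(c | 0x20u);  /* set bit 0x20: lower case */
    }
    *p = (char)c;
  }
}

/* A reply is only accepted if it echoes the name with identical case. */
static bool dns0x20_reply_matches(const char *sent, const char *echoed)
{
  return strcmp(sent, echoed) == 0;
}

/* Demonstration-only bit source: 0, 1, 0, 1, ... NOT suitable for real use. */
static unsigned int alternating_bit(void *ctx)
{
  unsigned int *counter = ctx;
  return (*counter)++ & 1u;
}
```

An off-path attacker who guesses the query ID and port but answers with the plain lower-case name would fail the `dns0x20_reply_matches()` comparison.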
+ +The reason to randomize case characters is to provide additional entropy in the +query to be able to detect off-path cache poisoning attacks for UDP. This is +not used for TCP connections which are not known to be vulnerable to such +attacks due to their stateful nature. + +[Google](https://groups.google.com/g/public-dns-discuss/c/KxIDPOydA5M) +has performed much research on case randomization and has in general found it +to be effective and widely supported. + +This feature is disabled by default and can be enabled via `ARES_FLAG_DNS0x20`. +There are some instances where servers do not properly facilitate this feature +and unlike in a recursive resolver where it may be possible to determine an +authoritative server is incapable, it's much harder to come to any reliable +conclusion as a stub resolver where the issue resides. Due to the recent wide +deployment of DNS 0x20 in large public DNS servers, it is expected +compatibility will improve rapidly where this feature, in time, may be able +to be enabled by default. + +Another feature which can be used to prevent off-path cache poisoning attacks +is [DNS Cookies](#dns-cookies). + + +## DNS Cookies + +DNS Cookies are a method of learned mutual authentication between a server +and a client as defined in +[RFC7873](https://datatracker.ietf.org/doc/html/rfc7873) +and [RFC9018](https://datatracker.ietf.org/doc/html/rfc9018). + +This mutual authentication ensures clients are protected from off-path cache +poisoning attacks, and protects servers from being used as DNS amplification +attack sources. Many servers will disable query throttling limits when DNS +Cookies are in use. It only applies to UDP connections. + +Since DNS Cookies are optional and learned dynamically, this is an always-on +feature and will automatically adjust based on the upstream server state.
The +only potential issue is that if a server once supported DNS Cookies and then +stops supporting them, c-ares must clear a regression timeout of 2 minutes +before it can accept responses without cookies. Such a scenario would be +exceedingly rare. + +Interestingly, the large public recursive DNS servers such as those provided by +[Google](https://developers.google.com/speed/public-dns/docs/using), +[CloudFlare](https://one.one.one.one/), and +[OpenDNS](https://opendns.com) do not have this feature enabled. That said, +most DNS products like [BIND](https://www.isc.org/bind/) enable DNS Cookies +by default. + +This feature requires the c-ares channel to persist for the lifetime of the +application. + + +## TCP FastOpen (0-RTT) + +TCP Fast Open is defined in [RFC7413](https://datatracker.ietf.org/doc/html/rfc7413) +and enables data to be sent with the TCP SYN packet when establishing the +connection, thus rivaling the performance of UDP. A previous connection must +already have been established in order to obtain the client cookie to +allow the server to trust the data sent in the first packet and know it was not +an off-path attack. + +TCP FastOpen can only be used with idempotent requests since in timeout +conditions the SYN packet with data may be re-sent which may cause the server +to process the packet more than once. Luckily DNS requests are idempotent. + +TCP FastOpen is supported on Linux, MacOS, and FreeBSD. Most other systems do +not support this feature or, as on Windows, require the use of completion +notifications, whereas c-ares relies on readiness notifications. + +Supported systems also need to be configured appropriately on both the client +and server systems.
+ +### Linux +sysctl `net.ipv4.tcp_fastopen`: + - `1` = client only (typically default) + - `2` = server only + - `3` = client and server + +### MacOS +sysctl `net.inet.tcp.fastopen`: + - `1` = client only + - `2` = server only + - `3` = client and server (typically default) + +### FreeBSD +sysctl `net.inet.tcp.fastopen.server_enable` (boolean) and +`net.inet.tcp.fastopen.client_enable` (boolean). + + +## Event Thread + +Historic c-ares integrations required integrators to have their own event loop +which would be required to notify c-ares of read and write events for each +socket. It was also required to notify c-ares at the appropriate timeout if +no events had occurred. This could be difficult to do correctly and could +lead to stalls or other issues. + +The Event Thread is currently supported on all systems except DOS which does +not natively support threading (however it could in theory be possible to +enable with something like [FSUpthreads](https://arcb.csc.ncsu.edu/~mueller/pthreads/)). + +c-ares is built by default with threading support enabled, however it may +be disabled at compile time. The event thread must also be specifically enabled +via `ARES_OPT_EVENT_THREAD`. + +Using the Event Thread feature also facilitates some other features like +[System Configuration Change Monitoring](#system-configuration-change-monitoring), +and automatically enables the `ares_set_pending_write_cb()` feature to optimize +multi-query writing. + + +## System Configuration Change Monitoring + +The system configuration is automatically monitored for changes to the network +and DNS settings. When a change is detected a thread is spawned to read the +new configuration then apply it to the current c-ares configuration. + +This feature requires the [Event Thread](#event-thread) to be enabled via +`ARES_OPT_EVENT_THREAD`. Otherwise it is up to the integrator to do their own +configuration monitoring and call `ares_reinit()` to reload the system +configuration.
+ +It is supported on Windows, MacOS, iOS and any system configuration that uses +`/etc/resolv.conf` and similar files such as Linux and FreeBSD. Specifically +excluded are DOS and Android due to missing mechanisms to support such a +feature. + +This feature requires the c-ares channel to persist for the lifetime of the +application. diff --git a/README.md b/README.md index f675ad95..40190604 100644 --- a/README.md +++ b/README.md @@ -109,6 +109,9 @@ gpg: binary signature, digest algorithm SHA512, key algorithm rsa2048 ``` ## Features + +See [Features](FEATURES.md) + ### Supported RFCs and Proposals - [RFC1035](https://datatracker.ietf.org/doc/html/rfc1035). Initial/Base DNS RFC