This PR enables DNS 0x20 as per
https://datatracker.ietf.org/doc/html/draft-vixie-dnsext-dns0x20-00 .
DNS 0x20 adds additional entropy to the request by randomly altering the
case of the DNS question to help prevent cache poisoning attacks.
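As an illustration, here is a minimal sketch of the case-randomization
step (the helper name is hypothetical, and a real implementation would
use a cryptographically secure RNG rather than rand()):

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: randomly toggle the case of each alphabetic
 * character in a DNS name.  ASCII upper and lower case differ only in
 * bit 0x20, which is where the technique gets its name. */
static void dns0x20_randomize(char *name, size_t len)
{
  size_t i;
  for (i = 0; i < len; i++) {
    /* NOTE: rand() is used only for illustration */
    if (isalpha((unsigned char)name[i]) && (rand() & 1)) {
      name[i] ^= 0x20; /* flip the case bit */
    }
  }
}
```

The resolver can then require that the question name echoed in the
response match this randomized form byte-for-byte; a mismatch suggests
a spoofed reply.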
Google DNS has implemented this support as of 2023, even though the
standard, proposed in 2008, has since expired:
https://groups.google.com/g/public-dns-discuss/c/KxIDPOydA5M
There have been documented cases of name server and caching server
non-conformance, though such cases are expected to become rarer,
especially now that Google has started using this technique.
This can be enabled via the `ARES_FLAG_DNS0x20` flag, which is currently
disabled by default. The test cases, however, enable this flag in order
to validate the feature.
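For example, enabling the flag could look like the following, using the
standard `ares_init_options()` API:

```c
#include <ares.h>
#include <string.h>

static int init_with_dns0x20(ares_channel *channel)
{
  struct ares_options options;
  int                 optmask = 0;

  memset(&options, 0, sizeof(options));
  options.flags = ARES_FLAG_DNS0x20;
  optmask      |= ARES_OPT_FLAGS;

  return ares_init_options(channel, &options, optmask);
}
```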
Implementors using this flag will notice that responses will retain the
mixed case, but since DNS names are case-insensitive, any proper
implementation should not be impacted.
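For instance, an application comparing a returned name against an
expected one should already be using a case-insensitive comparison
along these lines:

```c
#include <strings.h> /* strcasecmp() is POSIX; on Windows use _stricmp() */

/* DNS names are case-insensitive, so compare accordingly even when a
 * response retains the 0x20-randomized case of the query. */
static int dns_names_equal(const char *a, const char *b)
{
  return strcasecmp(a, b) == 0;
}
```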
There is currently no fallback mechanism implemented, as it isn't
immediately clear how this may affect a stub resolver like c-ares. We
aren't querying the authoritative name server but an intermediate
recursive resolver, so some domains may return invalid results while
others return valid results, all while querying the same nameserver.
Using DNS cookies, as suggested in #620, is likely a better mechanism
to fight cache poisoning attacks for stub resolvers.
TCP queries do not use this feature even if the `ARES_FLAG_DNS0x20` flag
is specified, since they are not subject to cache poisoning attacks.
Fixes Issue: #795
Fix By: Brad House (@bradh352)
The header inclusion logic in c-ares is hard to follow. Let's simplify
the way it works so that it is easier to understand and less likely to
break on new code changes. There's still more work to be done, but this
is a good start at simplifying things.
Fix By: Brad House (@bradh352)
The error `C1041: cannot open program database '....'; if multiple
CL.EXE write to the same .PDB file, please use /FS` might be output
under some conditions; add the `/FS` compiler flag to prevent it.
Fixes Issue: #796
Fix By: Brad House (@bradh352)
With very little effort we should be able to determine reasonably
appropriate timeouts based on prior query history. We track this
history so we can auto-scale when network conditions change (e.g. a
provider failover causes timings to change). Apple appears to do this
within its system resolver on macOS. Obviously we need a minimum,
maximum, and initial value to make sure the algorithm doesn't somehow
go off the rails.
Values (see the constants sketch after this list):
- Minimum Timeout: 250ms (approximate RTT half-way around the globe)
- Maximum Timeout: 5000ms (the recommended timeout in RFC 1123). This
may be reduced via ARES_OPT_MAXTIMEOUTMS, in which case the bound
specified by the option also caps the retry timeout.
- Initial Timeout: user-specified via configuration or
ARES_OPT_TIMEOUTMS
- Average latency multiplier: 5x (a local DNS server returning a cached
value will respond more quickly than one that needs to recurse, so we
need headroom for this)
- Minimum Count for Average: 3. This is the minimum number of queries we
need to form an average for the bucket.
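Expressed as constants, the values above might look like this (the
names are illustrative, not necessarily what the implementation uses):

```c
#define TIMEOUT_MIN_MS       250  /* approximate RTT half-way around the globe */
#define TIMEOUT_MAX_MS       5000 /* recommended timeout in RFC 1123 */
#define TIMEOUT_LATENCY_MULT 5    /* headroom for cached vs recursive lookups */
#define TIMEOUT_MIN_COUNT    3    /* minimum samples to trust an average */
/* The initial timeout comes from configuration (ARES_OPT_TIMEOUTMS). */
```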
Per-server buckets track latency over time (these are ephemeral,
meaning they don't persist once a channel is destroyed). We record both
the current timespan for the bucket and the immediately preceding
timespan, so that in the case of a roll-over we can still use recent
metrics for calculations (see the struct sketch after the note below):
- 1 minute
- 15 minutes
- 1 hour
- 1 day
- since inception
Each bucket contains:
- timestamp (the current time divided by the bucket interval)
- minimum latency
- maximum latency
- total time
- count
NOTE: the average latency is (total time / count); we calculate it
dynamically when needed.
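A sketch of this bookkeeping under the assumptions above (type and
field names are illustrative):

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef enum {
  BUCKET_1MIN = 0,
  BUCKET_15MIN,
  BUCKET_1HR,
  BUCKET_1DAY,
  BUCKET_INCEPTION,
  BUCKET_COUNT
} bucket_id_t;

/* Interval of each bucket in seconds; 0 means "since inception",
 * i.e. the bucket never rolls over. */
static const time_t bucket_interval_s[BUCKET_COUNT] = {
  60, 15 * 60, 60 * 60, 24 * 60 * 60, 0
};

typedef struct {
  time_t       timestamp;      /* current time divided by the interval */
  unsigned int latency_min_ms; /* minimum recorded latency */
  unsigned int latency_max_ms; /* maximum recorded latency */
  uint64_t     total_ms;       /* sum of recorded query times */
  size_t       count;          /* number of recorded queries */
} metric_bucket_t;

typedef struct {
  metric_bucket_t current[BUCKET_COUNT];
  metric_bucket_t previous[BUCKET_COUNT]; /* preceding timespan, for roll-overs */
} server_metrics_t;
```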
Basic algorithm for calculating the timeout to use (see the sketch
after the note below):
- Scan from the most recent bucket to the least recent
- Check the timestamp of the bucket; if it doesn't match the current
time, continue to the next bucket
- Check the count of the bucket; if it is not at least the "Minimum
Count for Average", check the preceding timespan's entry for that
bucket, and if that too has an insufficient count, continue to the next
bucket
- If we reach the end with no bucket match, use the "Initial Timeout"
- If a bucket is selected, take ("total time" / count) as the average
latency, multiply it by the "Average Latency Multiplier", and bound the
result by the "Minimum Timeout" and "Maximum Timeout"
NOTE: The timeout calculated may not be the timeout used. If we are
retrying the query on the same server another time, then it will use a
larger value.
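Continuing the sketch, the timeout calculation might look like this
(again illustrative, built on the hypothetical types and constants
above):

```c
static unsigned int calc_timeout_ms(const server_metrics_t *m,
                                    unsigned int initial_timeout_ms,
                                    time_t now)
{
  int i;

  for (i = 0; i < BUCKET_COUNT; i++) {
    const metric_bucket_t *b        = &m->current[i];
    time_t                 interval = bucket_interval_s[i];
    unsigned int           timeout;

    /* Timestamp doesn't match the current time: skip this bucket. */
    if (interval != 0 && b->timestamp != now / interval)
      continue;

    /* Too few samples: fall back to the preceding timespan's entry. */
    if (b->count < TIMEOUT_MIN_COUNT)
      b = &m->previous[i];
    if (b->count < TIMEOUT_MIN_COUNT)
      continue;

    /* Average latency times the multiplier, bounded by min/max. */
    timeout = (unsigned int)(b->total_ms / b->count) * TIMEOUT_LATENCY_MULT;
    if (timeout < TIMEOUT_MIN_MS)
      timeout = TIMEOUT_MIN_MS;
    if (timeout > TIMEOUT_MAX_MS)
      timeout = TIMEOUT_MAX_MS;
    return timeout;
  }

  return initial_timeout_ms; /* no bucket had enough history */
}
```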
On each query reply where the response is legitimate (a proper response
or NXDOMAIN) and not something like a server error, we update the
metrics (see the sketch after this list):
- Cycle through each bucket in order
- Check the timestamp of the bucket against the current timestamp; if
it is out of date, overwrite the previous entry with the current values
and clear the current values
- Compare the current minimum and maximum recorded latency against the
query time and adjust if necessary
- Increment "count" by 1 and "total time" by the query time
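The corresponding update on each legitimate reply might look like this
(same hypothetical types as above):

```c
#include <string.h> /* memset() */

/* Record one legitimate query's round-trip time into every bucket. */
static void record_query_time(server_metrics_t *m, unsigned int query_ms,
                              time_t now)
{
  int i;

  for (i = 0; i < BUCKET_COUNT; i++) {
    metric_bucket_t *b        = &m->current[i];
    time_t           interval = bucket_interval_s[i];
    time_t           stamp    = (interval != 0) ? now / interval : 0;

    /* Roll-over: preserve the stale values as "previous" and reset. */
    if (interval != 0 && b->timestamp != stamp) {
      m->previous[i] = *b;
      memset(b, 0, sizeof(*b));
      b->timestamp = stamp;
    }

    /* Adjust the recorded minimum/maximum latency. */
    if (b->count == 0 || query_ms < b->latency_min_ms)
      b->latency_min_ms = query_ms;
    if (query_ms > b->latency_max_ms)
      b->latency_max_ms = query_ms;

    b->total_ms += query_ms;
    b->count++;
  }
}
```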
Other Notes:
- This is always-on; the only user-configurable value is the initial
timeout, which simply re-uses the existing option.
- The minimum and maximum latencies for a bucket are currently unused,
but are there in case we find a need for them in the future.
Fixes Issue: #736
Fix By: Brad House (@bradh352)