As per #738, there are usecases where the DNS TXT record strings should
not be concatenated like RFC 7208 indicates. We cannot break ABI with
those using the new API, so we need to support retrieving the
concatenated version as well as a new API to retrieve the individual
strings which will be used by `ares_parse_text_reply_ext()` to restore
the old behavior prior to c-ares 1.20.
Fixes Issue: #738
Fix By: Brad House (@bradh352)
Use the unicode version of `RegOpenKeyEx` to avoid compilation error
caused by string type like:
```
../../node/deps/cares/src/lib/ares_event_configchg.c(322,9): error: incompatible pointer types passing 'char[62]' to parameter of type 'LPCWSTR' (aka 'const unsigned short *') [-Werror,-Wincompatible-pointer-types]
322 | "SYSTEM\\CurrentControlSet\\Services\\Tcpip\\Parameters\\Interfaces",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Fix By: Cheng (@zcbenz)
This PR enables DNS 0x20 as per
https://datatracker.ietf.org/doc/html/draft-vixie-dnsext-dns0x20-00 .
DNS 0x20 adds additional entropy to the request by randomly altering the
case of the DNS question to help prevent cache poisoning attacks.
Google DNS has implemented this support as of 2023, even though this is
a proposed and expired standard from 2008:
https://groups.google.com/g/public-dns-discuss/c/KxIDPOydA5M
There have been documented cases of name server and caching server
non-conformance, though it is expected to become more rare, especially
since Google has started using this.
This can be enabled via the `ARES_FLAG_DNS0x20` flag, which is currently
disabled by default. The test cases do however enable this flag to
validate this feature.
Implementors using this flag will notice that responses will retain the
mixed case, but since DNS names are case-insensitive, any proper
implementation should not be impacted.
There is currently no fallback mechanism implemented as it isn't
immediately clear how this may affect a stub resolver like c-ares where
we aren't querying the authoritative name server, but instead an
intermediate recursive resolver where some domains may return invalid
results while others return valid results, all while querying the same
nameserver. Likely using DNS cookies as suggested by #620 is a better
mechanism to fight cache poisoning attacks for stub resolvers.
TCP queries do not use this feature even if the `ARES_FLAG_DNS0x20` flag
is specified since they are not subject to cache poisoning attacks.
Fixes Issue: #795
Fix By: Brad House (@bradh352)
The header inclusion logic in c-ares is hard to follow. Lets try to
simplify the way it works to make it easier to understand and less
likely to break on new code changes. There's still more work to be done,
but this is a good start at simplifying things.
Fix By: Brad House (@bradh352)
With very little effort we should be able to determine fairly proper
timeouts we can use based on prior query history. We track in order to
be able to auto-scale when network conditions change (e.g. maybe there
is a provider failover and timings change due to that). Apple appears to
do this within their system resolver in MacOS. Obviously we should have
a minimum, maximum, and initial value to make sure the algorithm doesn't
somehow go off the rails.
Values:
- Minimum Timeout: 250ms (approximate RTT half-way around the globe)
- Maximum Timeout: 5000ms (Recommended timeout in RFC 1123), can be
reduced by ARES_OPT_MAXTIMEOUTMS, but otherwise the bound specified by
the option caps the retry timeout.
- Initial Timeout: User-specified via configuration or
ARES_OPT_TIMEOUTMS
- Average latency multiplier: 5x (a local DNS server returning a cached
value will be quicker than if it needs to recurse so we need to account
for this)
- Minimum Count for Average: 3. This is the minimum number of queries we
need to form an average for the bucket.
Per-server buckets for tracking latency over time (these are ephemeral
meaning they don't persist once a channel is destroyed). We record both
the current timespan for the bucket and the immediate preceding timespan
in case of roll-overs we can still maintain recent metrics for
calculations:
- 1 minute
- 15 minutes
- 1 hr
- 1 day
- since inception
Each bucket contains:
- timestamp (divided by interval)
- minimum latency
- maximum latency
- total time
- count
NOTE: average latency is (total time / count), we will calculate this
dynamically when needed
Basic algorithm for calculating timeout to use would be:
- Scan from most recent bucket to least recent
- Check timestamp of bucket, if doesn't match current time, continue to
next bucket
- Check count of bucket, if its not at least the "Minimum Count for
Average", check the previous bucket, otherwise continue to next bucket
- If we reached the end with no bucket match, use "Initial Timeout"
- If bucket is selected, take ("total time" / count) as Average latency,
multiply by "Average Latency Multiplier", bound by "Minimum Timeout" and
"Maximum Timeout"
NOTE: The timeout calculated may not be the timeout used. If we are
retrying
the query on the same server another time, then it will use a larger
value
On each query reply where the response is legitimate (proper response or
NXDOMAIN) and not something like a server error:
- Cycle through each bucket in order
- Check timestamp of bucket against current timestamp, if out of date
overwrite previous entry with values, clear current values
- Compare current minimum and maximum recorded latency against query
time and adjust if necessary
- Increment "count" by 1 and "total time" by the query time
Other Notes:
- This is always-on, the only user-configurable value is the initial
timeout which will simply re-uses the current option.
- Minimum and Maximum latencies for a bucket are currently unused but
are there in case we find a need for them in the future.
Fixes Issue: #736
Fix By: Brad House (@bradh352)
MSVC has been building with /W3 which isn't considered a safe level for
modern code. /W4 is recommended, but it too is lacking some recommended
options, so we enable /W4 and also the recommended options. We do,
however, have to disable a couple of options due to Windows headers not
being fully compliant sometimes as well as some things we do in c-ares
that it doesn't like, but aren't actually bad.
Fix By: Brad House (@bradh352)
The query cache should be enabled by default. This will help with
determining proper timeouts for #736. It can still be disabled by
setting the ttl to 0. There should be no negative consequences of this
in real-world scenarios since DNS is based on the TTL concept and
upstream servers will cache results and not recurse based on this
information anyhow. DNS queries and responses are very small, this
should have negligible impact on memory consumption.
Fix By: Brad House (@bradh352)
On some broken MacOS SDK versions, if HAVE_UNISTD_H is defined, it
must be defined to 1, otherwise it will cause build errors. Autotools
always defines to 1, but CMake doesn't.
Fixes Issue: #787
Fix By: Brad House (@bradh352)
The `NotifyIpInterfaceChange()` method only notifies of automatic
changes, such as via DHCP or interfaces going up or down. However,
someone editing the DNS settings manually would not trigger a
notification.
Use `RegNotifyChangeKeyValue()` to monitor
`HKLM\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters\Interfaces`
and `HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces`
to pick up these changes.
This new check also works when using the OpenWatcom compiler as the
symbols are available, so they have been enabled (though
`NotifyIpInterfaceChange()` method is disabled for OpenWatcom still due
to the linker libraries missing this symbol).
NOTE: it does appear that about 5 events get triggered for some reason,
it doesn't seem to hurt anything, but it will actually cause it to
re-read the configuration multiple times.
Fixes Issue: #761
Fix By: Brad House (@bradh352)
Add code annotations for ignoring specific code paths for coverage
calculations. The primary purpose of this is to make it easy to see the
code paths that we could (and probably should) write test cases for, as
these would have the most impact on delivery of a stable product.
The annotations used are:
`LCOV_EXCL_LINE: <designation>`, `LCOV_EXCL_START: <designation>`,
`LCOV_EXCL_STOP`
Unfortunately `LCOV_EXCL_BR_LINE` does not appear to be supported by
coveralls as it would have been a more elegant solution over START/STOP.
We specifically include the `<designation>` not just for future
reference but because it makes it easy to identify in case we want to
address these conditions in a different way in the future.
The main areas designated for exclusion are:
1. `OutOfMemory` - these are hard to test cases, and on modern systems,
are likely to never occur due to optimistic memory allocations, which
can then later cause the kernel to terminate your application due to
memory not actually being available. c-ares does have *some* testing
framework for this, if we wish to expand in the future, we can easily
use sed to get rid of of these annotations.
2. `DefensiveCoding` - these are impossible to reach paths at the point
in time the code was written. They are there for defensive coding in
case code is refactored in the future to prevent unexpected behavior.
3. `UntestablePath` - these are code paths that aren't possible to test,
such as failure of a system call.
4. `FallbackCode` - This is an entire set of code that is untestable
because its not able to simulate a failure of the primary path.
This PR also does add some actual coverage in the test cases where it is
easy to do.
Fix By: Brad House (@bradh352)
MacOS 10.8-10.11 won't compile with dnsinfo.h from configd-1109.140.1,
back off to configd-963.50.8 which still works with newer MacOS verions
Fixes Issue: #783
Fix By: Brad House (@bradh352)
Watt-32 (https://www.watt-32.net/) support has been broken for a long
time. Patch c-ares to fix Watt-32 support and also gets rid of the
`WIN32` macro which adds confusion, only use `USE_WINSOCK` macro.
Add a CI/CD task to build c-ares on Windows using MSVC with Watt-32.
Fixes Issue: #780
Fix By: Brad House (@bradh352)
With the current c-ares parser, as per PR #765 parsing was broken due to
validation that didn't understand the `SIG` record class. This PR adds
basic, non validating, and incomplete support for the `SIG` record type.
The additional `KEY` and `NXT` which would be required for additional
verification of the records is not implemented. It also does not store
the raw unprocessed RR data that would be required for the validation.
The primary purpose of this PR is to be able to recognize the record and
handle some periphery aspects such as validation of the class associated
with the RR and to not honor the TTL in the RR in the c-ares query cache
since it will always be 0.
Fixes#765
Fix By: Brad House (@bradh352)
The query cache wasn't properly flushing on server list changes
and was attempting to flush even when the server list didn't
actually change.
Fix By: Brad House (@bradh352)