systemd-resolved will return `ESERVFAIL` or `EREFUSED` by default on
single label domain names. See
https://github.com/systemd/systemd/issues/34101
They've basically labeled this as a non-issue even though it appears to
be in violation of the RFCs for downstream systems being able to support
negative caching. I haven't tested with the suggested
`ResolveUnicastSingleLabel=yes`, but that's off by default so its
unlikely people even knows this exists.
Since systemd is very prevalent these days, we really don't have much of
a choice but to work around their design decisions. This PR implements
this support, but *only* on single label names as it likely isn't valid
otherwise. It also adds test cases to confirm this behavior.
Fixes#852
Authored-By: Brad House (@bradh352)
As per #852 searching is failing, partially it is due to the ndots value
not defaulting to a proper value on linux, and partially due to
systemd-resolved returning the wrong error codes.
This PR fixes the first issue and adds containerized test cases to
validate the behavior and prevent issues in the future.
Reported-By: Hans-Christian Egtvedt (@egtvedt) and Mikael Lindemann(@mikaellindemann)
Authored-By: Brad House (@bradh352)
The tools like `adig` and `ahost` can take advantage of some of the
library functions within c-ares. We can't really split these library
functions into a helper library without possibly breaking users.
Technically if we could guarantee they rely on pkg-config or cmake
helpers we could make it work ... we can revisit that in the future,
maybe if we make a c-ares 2.0 where people might expect more breakage
(but I still wouldn't want to break API/ABI).
So what this does is it just exports some symbols in headers that aren't
distributed so end users aren't meant to use the symbols, but any
utilities or tests built by c-ares can. It does clutter up some of the
namespace, but all these symbols are guaranteed to be prefixed with
`ares_` so there shouldn't ever be symbol clashes due to this for end
users and its not so many symbols that it should affect any load times.
There will be **zero** API/ABI guarantees for these symbols. They can
change from release to release, this is ok since they are not public.
I'm not entirely thrilled with doing this solution, but I want to avoid
thing like hand-written parsers, such as is used in #856.
Authored-By: Brad House (@bradh352)
If a blank DNS name is used, the DNS query cache would fail due to an
invalid sanity check. This can be legitimate such as:
adig -t SOA .
This fixes that situation as well as a few other spots that were
uncovered and adds a test case to validate the behavior to ensure
it won't regress in the future.
Fixes#858
Reported-By: Nodar Chkuaselidze (@nodech)
Authored-By: Brad House (@bradh352)
The FNV1a implementation used in the hashtable threw away the
algorithm's offset basis in preference of the seed value to
ensure different instances of the hashtable produced different
output for the same keys to prevent against hash collision
attacks.
Throwing away the offset basis completely could lead to some
non-uniformity, so instead we are now XORing the seed and the
algorithm-defined offset basis as the new offset basis to use
which should rectify this.
In the current c-ares use cases, this will probably have little
to no impact, but future hash table uses may benefit from the
better distribution.
Authored-By: Brad House (@bradh352)
As per https://llvm.org/docs/LibFuzzer.html#fuzzer-friendly-build-mode
it is recommended to add a compiler flag of FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
to make it deterministic. This can aid in generating a smaller corpus
and make it more efficient in finding actual flaws.
Right now this primarily impacts:
- Anything using the ares random number generator, like ares__slist_*(),
DNS transaction ids, and dns client cookie values.
- ares__htable_*()
Authored-By: Brad House (@bradh352)
The main purpose of this PR is to modularize the communication library,
and streamline the code paths for TCP and UDP into a single flow. This
will help us add additional flows such as TLS, and also make sure these
communication functions return a known set of error codes so that
additional error codes can be added for new flows in the future.
It also adds a new optional callback of
`ares_set_notify_pending_write_callback()` that will assist in
aggregating data from multiple queries into a single write operation.
This doesn't apply to UDP, but on TCP and especially TLS in the future
this can be a significant win. This is automatically applied for the
Event Thread.
It also fixes a long standing issue if UDP connection became saturated
and started returning `EWOULDBLOCK` or `EAGAIN` it was treated as a
failure. Since this is more inline with the TCP code now, it will wait
on a write event to retry.
Finally there are additional cleanups due to not needing to be able to
retrieve socket errnos all over the place.
Authored-By: Brad House (@bradh352)
pkg-config uses the .private suffix for static linking to c-ares,
set the libraries and cflags necessary for this.
Authored-By: Christoph Reiter (@lazka)
Approved-By: Brad House (@bradh352)
API Level 23 (Android 6) is the recommended target and often used
by other projects, like React Native. There are a lot of custom
Android platforms that may be stuck on old versions so we want to
make sure we support such old versions still.
Authored-By: Brad House (@bradh352)
During development we were initially using MSG_FASTOPEN but then
switched to TCP_FASTOPEN_CONNECT for Linux, but we left in the
build-time dependency on just MSG_FASTOPEN which predates
TCP_FASTOPEN_CONNECT. This causes build issues on legacy platforms
like CentOS 7.
Fixes#850
Reported-By: Erik Lax (@eriklax)
On CI systems that can be overloaded, things like usleep() and select()
may not closely honor their timeouts, and can often be a multiple of the
requested timeout. Some tests, out of necessity need to rely on accurate
timing in order to test timeout conditions so this means test failures
when the skew is too large. Short of increasing timeouts to a point that
would make tests take an unreasonable amount of time, the alternative is
to make the OS honor the requested timeout more accurately.
On MacOS this means to set a realtime thread priority for the tests.
Other projects like libuv do this same thing. The code is taken from:
https://developer.apple.com/library/archive/technotes/tn2169/_index.html
Authored-By: Brad House (@bradh352)
When native thread is attached to JVM, then it's name is taken from
JavaVMAttachArgs. When no JavaVMAttachArgs, or no JavaVMAttachArgs::name
passed, then JVM on its own decides on taming thread. Those names are
not descriptive. To preserve thread name, pass the currently set thread
name in JavaVMAttachArgs::name. pthread_getname_np was introduced in API
26, hence use more generic approach.
Fixes#837
Authored By: Yauheni Khnykin (@Hsilgos)
GitHub actions supports running tests on various docker containers, move
Ubuntu 20.04 and Alpine tests to containers. Also move iOS testing to
GitHub actions since that runs on MacOS which is supported.
This should take additional load off of Cirrus-CI which consumes credits
like crazy. This leaves only FreeBSD and Linux ARM testing on Cirrus-CI.
Authored-By: Brad House (@bradh352)
Create a new data structure for a basic growable indexable array,
supporting the features you'd normally expect such as insert (at, last,
first), remove (at, last, first), and get (at, last, first). Internally
all data is stored in an appropriately sized array that can directly be
returned to the caller as a C array of the data type provided. The array
grows by powers of two, and has optimizations for head and tail
removals.
Array modifications can be risky (e.g. wrong reallocation sizes,
mis-sized memory moves, out-of-bounds access), so it makes sense to have
standardized code that is well tested.
The arrays used by the dns record parser / writer have been converted to
the new array data structure as have a few other instances.
Authored-By: Brad House (@bradh352)
TCP Fast Open (TFO) allows TCP connection establishment in 0-RTT when a
client and server have previously communicated. The SYN packet will also
contain the initial data packet from the client to the server. This
means there should be virtually no slowdown over UDP when both sides
support TCP FastOpen, which is unfortunately not always the case. For
instance, `1.1.1.1` appears to support TFO, however `8.8.8.8` does not.
This implementation supports Linux, Android, FreeBSD, MacOS, and iOS.
While Windows does have support for TCP FastOpen it does so via
completion APIs only, and that can't be used with polling APIs like used
by every other OS. We could implement it in the future if desired for
those using `ARES_OPT_EVENT_THREAD`, but it would probably require
adopting IOCP completely on Windows.
Sysctls are required to be set appropriately:
- Linux: `net.ipv4.tcp_fastopen`:
- `1` = client only (typically default)
- `2` = server only
- `3` = client and server
- MacOS: `net.inet.tcp.fastopen`
- `1` = client only
- `2` = server only
- `3` = client and server (typically default)
- FreeBSD: `net.inet.tcp.fastopen.server_enable` (boolean) and
`net.inet.tcp.fastopen.client_enable` (boolean)
This feature is always-on, when running on an OS with the capability
enabled. Though some middleboxes have impacted end-to-end TFO and caused
connectivity errors, all modern OSs perform automatic blackholing of IPs
that have issues with TFO. It is not expected this to cause any issues
in the modern day implementations.
This will also help with improving latency for future DoT and DoH
implementations.
Authored-By: Brad House (@bradh352)
Refactor some connection handling to reduce code duplication and to
unify the TCP and UDP codepaths a bit more. This will make some future
changes easier to make.
This also does some structure renaming to better conform with current
standards:
- `struct server_state` -> `ares_server_t`
- `struct server_connection` -> `ares_conn_t`
- `struct query` -> `ares_query_t`
Authored-by: Brad House (@bradh352)
When implementing ares_dns_rr_del_opt_byid() for DNS cookie support,
memory wasn't being zero'd out after removing an option, and when
ares_dns_rr_set_opt{_own}() was called, it would try to free the
address that still existed. Add a couple memset()'s to prevent
this issue.
Fixes#836
Fix By: Brad House (@bradh352)
During the implementation of server cookies, test cases were missing
to validate the server cookie in a prior reply was passed back, and
it turns out they were not.
This also adds tests for verification.
Fix By: Brad House (@bradh352)
DNS cookies are a simple form of learned mutual authentication supported
by most DNS server implementations these days and can help prevent DNS
Cache Poisoning attacks for clients and DNS amplification attacks for
servers.
Fixes#620
Fix By: Brad House (@bradh352)
When using EventThreads, the config change cleanup code might manipulate
the event update list if it uses file descriptors (such as on Linux).
This was being done without a lock. Rework the event enqueuing to handle
locking internally to prevent this and to simplify where it is used.
This was found by chance during an ASAN CI run.
Fix By: Brad House (@bradh352)