The DNS configuration on Apple platforms is stored in the system
configuration database. Apple does provide an emulated `/etc/resolv.conf`
on macOS (but not iOS), but it cannot represent the entirety of the DNS
configuration. Alternatively, libresolv could be used to retrieve
some system configuration, but it too is not capable of retrieving the
entirety of the DNS configuration.
Attempts to use the preferred public API of `SCDynamicStoreCreate()` and
friends yielded incomplete DNS information. That leaves some
Apple "internal" symbols from `configd` that we need to access in order
to get the entire configuration. We are not the only ones
to do this, as Google Chrome does the same:
https://chromium.googlesource.com/chromium/src/+/HEAD/net/dns/dns_config_watcher_mac.cc
These internal functions are what `libresolv` and `scutil` use to
retrieve the DNS configuration. Since these symbols are not publicly
available, we will dynamically load the symbols from `libSystem` and
import the `dnsinfo.h` private header extracted from:
https://opensource.apple.com/source/configd/configd-1109.140.1/dnsinfo/dnsinfo.h
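The dynamic lookup described above can be sketched roughly as follows, assuming a POSIX `dlopen()`/`dlsym()` environment. The helper name `lookup_optional_symbol` is hypothetical and this is not the code from the change itself. It relies on the assumption that `libSystem` is already loaded into the process, so a symbol such as `dns_configuration_copy` (declared in `dnsinfo.h`) can be found through the main program handle.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look up a symbol by name among the images already
 * loaded into the process. Returns NULL if the symbol is not exported. */
void *lookup_optional_symbol(const char *name)
{
  void *handle;
  void *sym;

  assert(name != NULL);

  /* dlopen(NULL, ...) returns a handle for the main program; dlsym() on
   * that handle also searches the libraries loaded along with it. */
  handle = dlopen(NULL, RTLD_LAZY);
  if (handle == NULL) {
    return NULL;
  }

  sym = dlsym(handle, name);
  dlclose(handle);
  return sym;
}
```

A caller would cast the result to the matching function-pointer type from `dnsinfo.h` and fall back to another configuration source when the lookup returns NULL.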
Fix By: Brad House (@bradh352)
**Summary**
This PR adds a server state callback that is invoked whenever a query to
a DNS server finishes.
The callback is invoked with the server details (as a string), a boolean
indicating whether the query succeeded or failed, flags describing the
query (currently just indicating whether TCP or UDP was used), and
custom userdata.
This can be used by user applications to gain observability into DNS
server health and usage. For example, an application could raise alerts
when a DNS server fails or recovers, or collect metrics to track how
often a DNS server is used and responds successfully.
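As a rough illustration of the observability use case, here is a self-contained sketch of a handler that takes the four arguments described above and keeps simple health metrics. The types, flag values, and function names are stand-ins invented for this sketch; the real callback type and flag constants come from `ares.h`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only. */
#define SKETCH_FLAG_UDP (1 << 0)
#define SKETCH_FLAG_TCP (1 << 1)

typedef struct {
  size_t successes;
  size_t failures;
  size_t tcp_queries;
  int    healthy;       /* 1 after a success, 0 after a failure */
  size_t state_changes; /* number of fail/recover transitions */
} server_metrics_t;

/* Mirrors the callback arguments described in the summary: server string,
 * success indicator, query flags, and custom userdata. */
void on_server_state(const char *server, int success, int flags,
                     void *userdata)
{
  server_metrics_t *m           = (server_metrics_t *)userdata;
  int               now_healthy = success ? 1 : 0;

  assert(m != NULL);
  (void)server; /* a real handler would likely key metrics by server */

  if (success) {
    m->successes++;
  } else {
    m->failures++;
  }
  if (flags & SKETCH_FLAG_TCP) {
    m->tcp_queries++;
  }
  /* A transition is where an application might raise or clear an alert. */
  if (now_healthy != m->healthy) {
    m->state_changes++;
    m->healthy = now_healthy;
  }
}
```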
**Testing**
Three new regression tests `MockChannelTest.ServStateCallback*` have
been added to test the new callback in different success/failure
scenarios.
Fix By: Oliver Welsh (@oliverwelsh)
If an invalid event thread system was provided, it would crash during cleanup due to a NULL pointer dereference.
Fixes Issue: #749
Fix By: Brad House (@bradh352)
Improve reliability in the server retry delay regression tests by
increasing the retry delay and sleeping for a little more than the retry
delay when attempting to force retries.
This helps to account for unreliable timing (e.g. NTP slew)
intermittently breaking pipelines.
Fix By: Oliver Welsh (@oliverwelsh)
I tried to build c-ares using CMake with the latest Android NDK
(r26/27), but the build failed as follows.
```
Building C object _deps/c-ares-source-build/src/lib/CMakeFiles/c-ares.dir/Debug/ares__buf.c.o
FAILED: _deps/c-ares-source-build/src/lib/CMakeFiles/c-ares.dir/Debug/ares__buf.c.o
In file included from c-ares/src/lib/ares__buf.c:27:
In file included from c-ares/include/ares.h:85:
In file included from Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/netinet/in.h:36:
In file included from Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/linux/in.h:231:
In file included from Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/aarch64-linux-android/asm/byteorder.h:12:
In file included from Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/linux/byteorder/little_endian.h:17:
Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/linux/swab.h:28:8: error: unknown type name 'inline'
28 | static inline __attribute__((__const__)) __u32 __fswahw32(__u32 val) {
| ^
Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/include/linux/swab.h:28:47: error: expected ';' after top level declarator
28 | static inline __attribute__((__const__)) __u32 __fswahw32(__u32 val) {
| ^
```
It looks like the NDK recently added C99 code containing `inline`
functions, but c-ares is setting the `C_STANDARD` CMake property to C90.
Fix By: Jiwoo Park (@jimmy-park)
A break statement is missing in the case where `timeout_ms >= 0`,
leading to a possible infinite loop.
Fixes Issue: #742
Fix By: Brad House (@bradh352)
**Summary**
By default c-ares will select the server with the least number of
consecutive failures when sending a query. However, this means that if a
server temporarily goes down and hits failures (e.g. a transient network
issue), then that server will never be retried until all other servers
hit the same number of failures.
This is an issue if the failed server is preferred to other servers in
the list, for example when a primary server and a backup server are
configured.
This PR adds new server failover retry behavior, where failed servers
are retried with small probability after a minimum delay has passed. The
probability and minimum delay are configurable via the
`ARES_OPT_SERVER_FAILOVER` option. By default c-ares will use a
probability of 10% and a minimum delay of 5 seconds.
In addition, this PR includes a small change to always close out
connections to servers which have hit failures, even with
`ARES_FLAG_STAYOPEN`. It's possible that resetting the connection can
resolve some server issues (e.g. by resetting the source port).
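The selection behavior described above can be modeled with a small self-contained sketch. This is a simplified illustration, not the c-ares implementation: the struct, the function name, and the way the random `roll` is passed in are all assumptions made so the logic is deterministic and easy to follow.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  unsigned consecutive_failures;
  long     retry_after_ms; /* earliest time a failed server may be retried */
} sketch_server_t;

/* Servers are listed in order of preference (index 0 is most preferred).
 * `roll` is a caller-supplied random number; `retry_percent` is the retry
 * probability in percent (the text above mentions a default of 10). */
size_t sketch_pick_server(const sketch_server_t *servers, size_t n,
                          long now_ms, unsigned roll, unsigned retry_percent)
{
  size_t best = 0;
  size_t i;

  assert(servers != NULL && n > 0);

  /* Default behavior: fewest consecutive failures wins, ties go to the
   * earlier (more preferred) entry. */
  for (i = 1; i < n; i++) {
    if (servers[i].consecutive_failures < servers[best].consecutive_failures) {
      best = i;
    }
  }

  /* Failover retry: with the configured probability, give a more preferred
   * server another chance once its minimum delay has passed. Every entry
   * before `best` has strictly more failures than `best`. */
  if (roll % 100 < retry_percent) {
    for (i = 0; i < best; i++) {
      if (now_ms >= servers[i].retry_after_ms) {
        return i;
      }
    }
  }

  return best;
}
```

With a failed primary whose delay expires at 5000 ms and a healthy backup, the sketch keeps choosing the backup until the delay has passed and the roll falls inside the retry percentage.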
**Testing**
A new set of regression tests has been added to test the new server
failover retry behavior.
Fixes Issue: #717
Fix By: Oliver Welsh (@oliverwelsh)
Due to an error in creating the list of domains to search, if no search
domains were configured, resolution would fail.
Fixes Issue: #737
Fix By: Brad House (@bradh352)
As per Issue #734 some people use `ndots:0` in their configuration which
is allowed by the system resolver but not by c-ares. Add support for
`ndots:0` and add a test case to validate this behavior.
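For context, the conventional resolver meaning of `ndots` is a threshold: a name with at least that many dots is tried as-is before any search domains are applied. The sketch below illustrates that rule only; the function name is hypothetical and the actual c-ares logic may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `name` should be queried as-is before trying search
 * domains, i.e. when it contains at least `ndots` dots. */
int sketch_try_as_is_first(const char *name, size_t ndots)
{
  size_t      dots = 0;
  const char *p;

  assert(name != NULL);

  for (p = name; *p != '\0'; p++) {
    if (*p == '.') {
      dots++;
    }
  }

  /* With ndots == 0 this is true for every name, including "www". */
  return dots >= ndots;
}
```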
Fixes Issue: #734
Fix By: Brad House (@bradh352)
Multiple functions have been deprecated over the years; annotate them
with the `deprecated` attribute.
Where possible, show a message about their replacements.
This is a continuation/completion of PR #706
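A minimal sketch of how such an annotation can be done, assuming a GCC- or Clang-compatible compiler. The macro and function names here are hypothetical and are not the ones used in the c-ares headers.

```c
#include <assert.h>

/* Expand to a deprecation attribute with a replacement hint where the
 * compiler supports it, and to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_DEPRECATED_FOR(replacement) \
     __attribute__((deprecated("Use " #replacement " instead")))
#else
#  define SKETCH_DEPRECATED_FOR(replacement)
#endif

int sketch_new_lookup(int id);

/* Callers of the old function get a compile-time warning naming the
 * replacement; the function itself keeps working. */
SKETCH_DEPRECATED_FOR(sketch_new_lookup) int sketch_old_lookup(int id);

int sketch_new_lookup(int id)
{
  assert(id >= 0);
  return id * 2;
}

int sketch_old_lookup(int id)
{
  return sketch_new_lookup(id);
}
```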
Fix By: Cristian Rodríguez (@crrodriguez)
c-ares has historically passed around raw DNS packets in binary form.
Now that we have a new parser, and messages are already parsed
internally, let's pass around that parsed message rather than requiring
multiple parse attempts on the same message. Also add new
`ares_send_dnsrec()` and `ares_query_dnsrec()` functions, similar to the
`ares_search_dnsrec()` added with PR #719, that can return the pointer to
the `ares_dns_record_t` to the caller enqueuing queries, and rework
`ares_search_dnsrec()` to use `ares_send_dnsrec()` internally.
Fix By: Brad House (@bradh352)
This PR adds a new function `ares_search_dnsrec()` to search for records
using the new DNS record parser.
The function takes an arbitrary DNS record object to search (that must
represent a query for a single name). The function takes a new callback
type, `ares_callback_dnsrec`, that is invoked with a parsed DNS record
object rather than the raw buffer(+length).
The original motivation for this change is to provide support for
[draft-kaplan-enum-sip-routing-04](https://datatracker.ietf.org/doc/html/draft-kaplan-enum-sip-routing-04);
when routing phone calls using an ENUM server, it can be useful to
include identifying source information in an OPT RR options value, to
help select the appropriate route for the call. The new function allows
for more customisable searches like this.
**Summary of code changes**
A new function `ares_search_dnsrec()` has been added and exposed.
Moreover, the entire `ares_search_int()` internal code flow has been
refactored to use parsed DNS record objects and the new DNS record
parser. The DNS record object is passed through the `search_query`
structure by encoding/decoding to/from a buffer (if multiple search
domains are used). A helper function `ares_dns_write_query_altname()` is
used to re-write the DNS record object with a new query name (used to
append search domains).
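To make the search-domain step concrete, here is a string-level sketch of producing an alternate query name by appending a search domain. It is only an illustration under simplifying assumptions: the function name is hypothetical, and the real helper rewrites a DNS record object rather than a plain string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes "<name>.<domain>" into `out`. Returns 1 on success, or 0 if the
 * result (including the terminating NUL) does not fit in `out_len`. */
int sketch_append_search_domain(const char *name, const char *domain,
                                char *out, size_t out_len)
{
  size_t name_len;
  size_t domain_len;

  assert(name != NULL && domain != NULL && out != NULL);

  name_len   = strlen(name);
  domain_len = strlen(domain);

  /* name + '.' + domain + NUL */
  if (name_len + 1 + domain_len + 1 > out_len) {
    return 0;
  }

  memcpy(out, name, name_len);
  out[name_len] = '.';
  memcpy(out + name_len + 1, domain, domain_len + 1); /* copies the NUL */
  return 1;
}
```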
`ares_search()` is now a wrapper around the new internal code, where the
DNS record object is created based on the name, class and type
parameters.
The new function uses a new callback type, `ares_callback_dnsrec`. This
is invoked with a parsed DNS record object. For now, we convert from
`ares_callback` to this new type using `ares__dnsrec_convert_cb()`.
Some functions that are common to both `ares_query()` and
`ares_search()` have been refactored using the new DNS record parser.
See `ares_dns_record_create_query()` and
`ares_dns_query_reply_tostatus()`.
**Testing**
A new FV has been added to test the new function, which searches for a
DNS record containing an OPT RR with custom options value.
As part of this, I needed to enhance the mock DNS server to expect
request text (and assert that it matches actual request text). This is
because the FV needs to check that the request contains the correct OPT
RR.
**Documentation**
The man page docs have been updated to describe the new feature.
**Futures**
In the future, a new variant of `ares_send()` could be introduced in the
same vein (`ares_send_dnsrec()`). This could be used by
`ares_search_dnsrec()`. Moreover, we could migrate internal code to use
`ares_callback_dnsrec` as the default callback.
This will help to make the new DNS record parser the norm in c-ares.
---------
Co-authored-by: Oliver Welsh (@oliverwelsh)
Rewrite configuration parsers using new memory safe parsing functions.
After CVE-2024-25629 it's obvious that we need to prioritize again
getting all the hand-written parsers with direct pointer manipulation
replaced. They're just not safe and are hard to audit. It was yet another
example of 20+ year old code having a memory safety issue just now coming
to light.
Though these parsers are definitely less efficient, they're written with
memory safety in mind, and any performance difference is going to be
meaningless for something that only happens once in a while.
Fix By: Brad House (@bradh352)
If initializing using default settings fails, there may be a memory leak of
search domains that were set by system configuration.
Fixes Issue: #724
Fix By: Brad House (@bradh352)
Since acountry cannot be restored due to nerd.dk being decommissioned,
we should completely remove the manpage and source. This will also
resolve issue #718.
Fixes Issue: #718
Fix By: Brad House (@bradh352)
Hello, I work on an application for Microsoft which uses c-ares to
perform DNS lookups. We have made some minor changes to the library over
time, and would like to contribute these back to the project in case
they are useful more widely. This PR adds a new channel init flag,
described below.
Please let me know if I can include any more information to make this PR
better/easier for you to review. Thanks!
**Summary**
When initializing a channel with `ares_init_options()`, if there are no
nameservers available (because `ARES_OPT_SERVERS` is not used and
`/etc/resolv.conf` is either empty or not available) then a default
local named server will be added to the channel.
However, in some applications a local named server will never be
available. In this case, all subsequent queries on the channel will
fail.
If we know this ahead of time, then it may be preferred to fail channel
initialization directly rather than wait for the queries to fail. This
gives better visibility, since we know that the failure is due to
missing servers rather than something going wrong with the queries.
This PR adds a new flag `ARES_FLAG_NO_DFLT_SVR`, to indicate that a
default local named server should not be added to a channel in this
scenario. Instead, a new error `ARES_EINITNOSERVER` is returned and
initialization fails.
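The decision described above can be sketched as a small self-contained model. The constants and function below are stand-ins invented for this illustration (the real flag and error code are `ARES_FLAG_NO_DFLT_SVR` and `ARES_EINITNOSERVER` from `ares.h`), and the actual initialization code is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real flag and status codes. */
#define SKETCH_FLAG_NO_DFLT_SVR (1 << 0)

typedef enum {
  SKETCH_SUCCESS       = 0,
  SKETCH_EINITNOSERVER = 1
} sketch_status_t;

/* Final step of server setup: `num_servers` holds how many nameservers
 * were found via options or system configuration. */
sketch_status_t sketch_finish_server_init(size_t *num_servers, int flags)
{
  assert(num_servers != NULL);

  if (*num_servers > 0) {
    return SKETCH_SUCCESS; /* servers are available, nothing to do */
  }

  if (flags & SKETCH_FLAG_NO_DFLT_SVR) {
    return SKETCH_EINITNOSERVER; /* fail initialization up front */
  }

  /* Historical behavior: fall back to a default local named server. */
  *num_servers = 1;
  return SKETCH_SUCCESS;
}
```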
**Testing**
I have added 2 new FV tests:
- `ContainerNoDfltSvrEmptyInit` to test that initialization fails when
no nameservers are available and the flag is set.
- `ContainerNoDfltSvrFullInit` to test that initialization still
succeeds when the flag is set but other nameservers are available.
Existing FVs are all passing.
**Documentation**
I have had a go at manually updating the docs to describe the new
flag/error, but couldn't see any contributing guidance about testing
this. Please let me know if you'd like anything more here.
---------
Fix By: Oliver Welsh (@oliverwelsh)
Add a function to request the number of active queries from an ares
channel. This will return the number of in-flight requests to DNS
servers. Some functions like `ares_getaddrinfo()`, when using `AF_UNSPEC`,
may enqueue multiple queries, which will be reflected in this count.
In the future, if we implement support for queuing (e.g. for throttling
purposes), and/or implement support for tracking user-requested queries
(e.g. for cancelation), we can provide additional functions for
inspecting those queues.
Fix By: Brad House (@bradh352)