The test framework was using 100ms timeout passed to select(), and not using ares_timeout() to calculate the actual recommended value based on the queries in queue. Using ares_timeout() tests the functionality of ares_timeout() itself and will provide more responsive results.
Fix By: Brad House (@bradh352)
As per #266, TCP queries are basically broken. If we get a partial reply, things just don't work, but unlike UDP, TCP may get fragmented and we need to properly handle that.
I've started creating a basic parser/buffer framework for c-ares for memory safety reasons, but it also helps for things like this where we shouldn't be manually tracking positions and fetching only a couple of bytes at a time from a socket. This parser/buffer will be expanded and used more in the future.
This also resolves#206 by allowing NULL to be specified for some socket callbacks so they will auto-route to the built-in c-ares functions.
Fixes: #206, #266
Fix By: Brad House (@bradh352)
As per #541, when using AF_UNSPEC with ares_getaddrinfo() (and in turn with ares_gethostbynam()) if we receive a successful response for one address class, we should not allow the other address class to continue on with retries, just return the address class we have.
This will limit the overall query time to whatever timeout remains for the pending query for the other address class, it will not, however, terminate the other query as it may still prove to be successful (possibly coming in less than a millisecond later) and we'd want that result still. It just turns off additional error processing to get the result back quicker.
Fixes Bug: #541
Fix By: Brad House (@bradh352)
Add a new ARES_OPT_UDP_MAX_QUERIES option with udp_max_queries parameter that can be passed to ares_init_options(). This value defaults to 0 (unlimited) to maintain existing compatibility, any positive number will cause new UDP ephemeral ports to be created once the threshold is reached, we'll call these 'connections' even though its technically wrong for UDP.
Implementation Details:
* Each server entry in a channel now has a linked-list of connections/ports for udp and tcp. The first connection in the list is the one most likely to be eligible to accept new queries.
* Queries are now tracked by connection rather than by server.
* Every time a query is detached from a connection, the connection that it was attached to will be checked to see if it needs to be cleaned up.
* Insertion, lookup, and searching for connections has been implemented as O(1) complexity so the number of connections will not impact performance.
* Remove is_broken from the server, it appears it would be set and immediately unset, so must have been invalidated via a prior patch. A future patch should probably track consecutive server errors and de-prioritize such servers. The code right now will always try servers in the order of configuration, so a bad server in the list will always be tried and may rely on timeout logic to try the next.
* Various other cleanups to remove code duplication and for clarification.
Fixes Bug: #444
Fix By: Brad House (@bradh352)
c-ares currently lacks modern data structures that can make coding easier and more efficient. This PR implements a new linked list, skip list (sorted linked list), and hashtable implementation that are easy to use and hard to misuse. Though these implementations use more memory allocations than the prior implementation, the ability to more rapidly iterate on the codebase is a bigger win than any marginal performance difference (which is unlikely to be visible, modern systems are much more powerful than when c-ares was initially created).
The data structure implementation favors readability and audit-ability over performance, however using the algorithmically correct data type for the purpose should offset any perceived losses.
The primary motivation for this PR is to facilitate future implementation for Issues #444, #135, #458, and possibly #301
A couple additional notes:
The ares_timeout() function is now O(1) complexity instead of O(n) due to the use of a skiplist.
Some obscure bugs were uncovered which were actually being incorrectly validated in the test cases. These have been addressed in this PR but are not explicitly discussed.
Fixed some dead code warnings in ares_rand for systems that don't need rc4
Fix By: Brad House (@bradh352)
All files have their licence and copyright information clearly
identifiable. If not in the file header, they are set separately in
.reuse/dep5.
All used license texts are provided in LICENSES/
* Merged latest OpenBSD changes for inet_net_pton_ipv6() into c-ares.
* Always use our own IP conversion functions now, do not delegate to OS
so we can have consistency in testing and fuzzing.
* Removed bogus test cases that never should have passed.
* Add new test case for crash bug found.
Fix By: Brad House (@bradh352)
RFC6761 6.3 states:
The domain "localhost." and any names falling within ".localhost."
We were only honoring "localhost".
Fixes: #477
Fix By: Brad House (@bradh352)
In ares_set_sortlist, it calls config_sortlist(..., sortstr) to parse
the input str and initialize a sortlist configuration.
However, ares_set_sortlist has not any checks about the validity of the input str.
It is very easy to create an arbitrary length stack overflow with the unchecked
`memcpy(ipbuf, str, q-str);` and `memcpy(ipbufpfx, str, q-str);`
statements in the config_sortlist call, which could potentially cause severe
security impact in practical programs.
This commit add necessary check for `ipbuf` and `ipbufpfx` which avoid the
potential stack overflows.
fixes#496
Fix By: @hopper-vul
* add ares_strsplit unit test
The test reveals a bug in the implementation of ares_strsplit when the
make_set parameter is set to 1, as distinct domains are confused for
equal:
out = ares_strsplit("example.com, example.co", ", ", 1, &n);
evaluates to n = 1 with out = { "example.com" }.
* bugfix and cleanup of ares_strsplit
The purpose of ares_strsplit in c-ares is to split a comma-delimited
string of unique (up to letter case) domains. However, because the
terminating NUL byte was not checked in the substrings when comparing
for uniqueness, the function would sometimes drop domains it should
not. For example,
ares_strsplit("example.com, example.co", ",")
would only result in a single domain "example.com".
Aside from this bugfix, the following cleanup is performed:
1. The tokenization now happens with the help of strcspn instead of the
custom function is_delim.
2. The function list_contains has been inlined.
3. The interface of ares_strsplit has been simplified by removing the
parameter make_set since in practice it was always 1.
4. There are fewer passes over the input string.
5. We resize the table using realloc() down to its minimum size.
6. The docstring of ares_strsplit is updated and also a couple typos
are fixed.
There occurs a single use of ares_strsplit and since the make_set
parameter has been removed, the call in ares_init.c is modified
accordingly. The unit test for ares_strsplit is also updated.
Fix By: Nikolaos Chatzikonstantinou (@createyourpersonalaccount)
ai_addrlen was erroneously returning 16 bytes instead of the
sizeof(struct sockaddr_in6). This is a regression introduced
in 1.18.0.
Reported by: James Brown <jbrown@easypost.com>
Fix By: Brad House (@bradh352)
As per RFC6761 Section 6.3, "localhost" lookups need to be special cased to return loopback addresses, and not forward queries to recursive dns servers.
We first look up via files (/etc/hosts or equivalent), and if that fails, we then attempt a system-specific address enumeration for loopback addresses (currently Windows-only), and finally fallback to ::1 and 127.0.0.1.
Fix By: Brad House (@bradh352)
Fixes Bug: #399
ares_gethostbyname() and ares_getaddrinfo() do a lot of similar things, however ares_getaddrinfo() has some desirable behaviors that should be imported into ares_gethostbyname(). For one, it sorts the address lists for the most likely to succeed based on the current system routes. Next, when AF_UNSPEC is specified, it properly handles search lists instead of first searching all of AF_INET6 then AF_INET, since ares_gethostbyname() searches in parallel. Therefore, this PR should also resolve the issues attempted in #94.
A few things this PR does:
1. ares_parse_a_reply() and ares_parse_aaaa_reply() had very similar code to translate struct ares_addrinfo into a struct hostent as well as into struct ares_addrttl/ares_addr6ttl this has been split out into helper functions of ares__addrinfo2hostent() and ares__addrinfo2addrttl() to prevent this duplicative code.
2. ares_getaddrinfo() was apparently never honoring HOSTALIASES, and this was discovered once ares_gethostbyname() was turned into a wrapper, the affected test cases started failing.
3. A slight API modification to save the query hostname into struct ares_addrinfo as the last element of name. Since this is the last element, and all user-level instances of struct ares_addrinfo are allocated internally by c-ares, this is not an ABI-breaking change nor would it impact any API compatibility. This was needed since struct hostent has an h_name element.
4. Test Framework: MockServer tests via TCP would fail if more than 1 request was received at a time which is common when ares_getaddrinfo() queries for both A and AAAA records simultaneously. Infact, this was a long standing issue in which the ares_getaddrinfo() test were bypassing TCP alltogether. This has been corrected, the message is now processed in a loop.
5. Some tests had to be updated for overall correctness as they were invalid but somehow passing prior to this change.
Change By: Brad House (@bradh352)
ax_code_coverage.m4 dropped the @CODE_COVERAGE_RULES@ macro, so we need to switch to the latest recommendation from the m4 file. This requires updates to Makefile.am.
Fix By: Felix Yan (@felixonmars)
To prevent possible users having XSS issues due to intentionally malformed DNS replies, validate hostnames returned in responses and return EBADRESP if they are not valid.
It is not clear what legitimate issues this may cause at this point.
Bug Reported By: philipp.jeitner@sit.fraunhofer.de
Fix By: Brad House (@bradh352)
There is too much inconsistency between platforms for arpa/nameser.h and arpa/nameser_compat.h for the way the current files are structured. Still load the respective system files but make our private nameser.h more forgiving.
Fixes: #388
Fix By: Brad House (@bradh352)
Some systems may return either NULL or a valid pointer on malloc(0). c-ares should never call malloc(0) so lets return NULL so we're more likely to find an issue if it were to occur.
It appears that when building tests, it would hardcode enabling building
of the c-ares static library. This was probably due to Windows limitations
in symbol visibility.
This change will use the static library if it exists for tests, always.
Otherwise, it will only forcibly enable static libraries for tests on
Windows.
Fixes: #380
Fix By: Brad House (@bradh352)
The fuzz test files were not being distributed. This doesn't appear to be
a regression, it looks like they have never been distributed.
Fixes: #379
Fix By: Brad House (@bradh352)
CAA (Certification Authority Authorization) was introduced in RFC 6844.
This has been obsoleted by RFC 8659. This commit added the possibility
to query CAA resource records with adig and adds a parser for CAA
records, that can be used in conjunction with ares_query(3).
Closes Bug: #292
Fix By: Daniela Sonnenschein (@lxdicted)
Originally started by Daniel Stenberg (@bagder) with #123, this patch reorganizes the c-ares source tree to have a more modern layout. It also fixes out of tree builds for autotools, and automatically builds the tests if tests are enabled. All tests are passing which tests each of the supported build systems (autotools, cmake, nmake, mingw gmake). There may be some edge cases that will have to be caught later on for things I'm not aware of.
Fix By: Brad House (@bradh352)
EDNS retry should be based on FORMERR returned without an OPT RR record as per https://tools.ietf.org/html/rfc6891#section-7 rather than just treating any unexpected error condition as a reason to disable EDNS on the channel.
Fix By: Erik Lax (@eriklax)
The c-ares test suite was hardcoded to use port 5300 (and possibly 5301, 5302) for the test suite. Especially in containers, there may be no guarantee these ports are available and cause tests to fail when they could otherwise succeed. Instead, request the system to assign a port to use dynamically. This is now the default. To override, the test suite still takes the "-p <port>" option as it always has and will honor that.
Fix By: Anthony Penniston (@apenn-msft)
If a query is performed for dynamodb.us-east-1.amazonaws.com. with ndots=5, it was attempting to search the search domains rather than just attempting the FQDN that was passed it. This patch now at least attempts the FQDN first.
We may need to determine if we should abort any further searching, however as is probably intended.
Fix by: Jonathan Maye-Hobbs (@wheelpharoah)