In order to be sure a different server is chosen on the next query,
a read error should result in the failure count being updated
first before requeing the request to a different server.
Fix By: Brad House (@bradh352)
As of c-ares 1.22.0, server timeouts were erroneously not incrementing server failures meaning the server in use wouldn't rotate. There was apparently never a test case for this condition.
This PR fixes the bug and adds a test case to ensure it behaves properly.
Fixes Bug: #650
Fix By: Brad House (@bradh352)
c-ares does not have any concept of thread-safety. It has always been 100% up to the implementor to ensure they never call c-ares from more than one thread at a time. This patch adds basic thread-safety support, which can be disabled at compile time if not desired. It uses a single recursive mutex per channel, which should be extremely quick when uncontested so overhead should be minimal.
Fixes Bug: #610
Also sets the stage to implement #611
Fix By: Brad House (@bradh352)
For historic reasons, we have users depending on ares_set_servers_*()
to return ARES_SUCCESS when passing no servers and actually *clear*
the server list. It appears they do this for test cases to simulate
DNS unavailable or similar. Presumably they could achieve the same
effect in other ways (point to localhost on a port that isn't in use).
But it seems like this might be wide-spread enough to cause headaches
so we just will document and test for this behavior, clearly it hasn't
caused "issues" for anyone with the old behavior.
See: https://github.com/nodejs/node/pull/50800
Fix By: Brad House (@bradh352)
This PR implements a query cache at the lowest possible level, the actual dns request and response messages. Only successful and `NXDOMAIN` responses are cached. The lowest TTL in the response message determines the cache validity period for the response, and is capped at the configuration value for `qcache_max_ttl`. For `NXDOMAIN` responses, the SOA record is evaluated.
For a query to match the cache, the opcode, flags, and each question's class, type, and name are all evaluated. This is to prevent matching a cached entry for a subtly different query (such as if the RD flag is set on one request and not another).
For things like ares_getaddrinfo() or ares_search() that may spawn multiple queries, each individual message received is cached rather than the overarching response. This makes it possible for one query in the sequence to be purged from the cache while others still return cached results which means there is no chance of ever returning stale data.
We have had a lot of user requests to return TTLs on all the various parsers like `ares_parse_caa_reply()`, and likely this is because they want to implement caching mechanisms of their own, thus this PR should solve those issues as well.
Due to the internal data structures we have these days, this PR is less than 500 lines of new code.
Fixes#608
Fix By: Brad House (@bradh352)
The retry timeout values were using a fixed calculation which could cause multiple simultaneous queries to timeout and retry at the exact same time. If a DNS server is throttling requests, this could cause the issue to never self-resolve due to all requests recurring at the same instance again.
This PR also creates a maximum timeout option to make sure the random value selected does not exceed this value.
Fix By: Ignat (@Kontakter)
adig previously performed manual parsing of the DNS records. Now it can focus strictly on formatting of output data for printing. It simply iterates across the parsed DNS packet and queries for the RRs, parameters for each RR, and the datatypes for each parameter. adig will now automatically pick up new RRs from the c-ares library due to the dynamic nature.
The adig format also now more closely resembles that of BIND's `dig` output.
A few more helpers needed to be added to the c-ares library that were missing. There ware a couple of minor bugs and enhancements also needed.
Example:
```
./adig -t ANY www.google.com
; <<>> c-ares DiG 1.21.0 <<>> www.google.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: RCODE, id: 23913
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: 0; udp: 512
;; QUESTION SECTION:
;www.google.com. IN ANY
;; ANSWER SECTION:
www.google.com. 162 IN A 142.251.107.99
www.google.com. 162 IN A 142.251.107.105
www.google.com. 162 IN A 142.251.107.103
www.google.com. 162 IN A 142.251.107.147
www.google.com. 162 IN A 142.251.107.104
www.google.com. 162 IN A 142.251.107.106
www.google.com. 162 IN AAAA 2607:f8b0:400c:c32::93
www.google.com. 162 IN AAAA 2607:f8b0:400c:c32::69
www.google.com. 162 IN AAAA 2607:f8b0:400c:c32::68
www.google.com. 162 IN AAAA 2607:f8b0:400c:c32::6a
www.google.com. 21462 IN HTTPS 1 . alpn="h2,h3"
;; MSG SIZE rcvd: 276
```
Fix By: Brad House (@bradh352)
The `ares_dns_record_t` data structure created in the prior release is capable of holding a complete parsed DNS message and also provides all helpers in order to fill in the data structure. This PR adds write capabilities for this data structure to form a complete message and supports features such as DNS name compression as defined in RFC1035. Though this message writing capability goes further than c-ares internally needs, external users may find it useful ... and we may find it useful for test validation as well.
This also replaces the existing message writing code in `ares_create_query()`, as well rewriting the request message without EDNS in ares_process.c's `process_answer()`.
Fix By: Brad House (@bradh352)
`ares_channel` is defined as `typedef struct ares_channeldata *ares_channel;`. The problem with this, is it embeds the pointer into the typedef, which means an `ares_channel` can never be declared as `const` as if you write `const ares_channel channel`, that expands to `struct ares_channeldata * const ares_channel` and not `const struct ares_channeldata *channel`.
We will now typedef `ares_channel_t` as `typedef struct ares_channeldata ares_channel_t;`, so if you write `const ares_channel_t *channel`, it properly expands to `const struct ares_channeldata *channel`.
We are maintaining the old typedef for API compatibility with existing integrations, and due to typedef expansion this should not even cause any compiler warnings for existing code. There are no ABI implications with this change. I could be convinced to keep existing public functions as `ares_channel` if a sufficient argument exists, but internally we really need make this change for modern best practices.
This change will allow us to internally use `const ares_channel_t *` where appropriate. Whether or not we decide to change any public interfaces to use `const` may require further discussion on if there might be ABI implications (I don't think so, but I'm also not 100% sure what a compiler internally does with `const` when emitting machine code ... I think more likely ABI implications would occur going the opposite direction).
FYI, This PR was done via a combination of sed and clang-format, the only manual code change was the addition of the new typedef, and a couple doc fixes :)
Fix By: Brad House (@bradh352)
This PR makes the server list a dynamic sorted list of servers. The sort order is [ consecutive failures, system config index ]. The server list can be updated via ares_set_servers_*(). Any queries currently directed to servers that are no longer in the list will be automatically re-queued to a different server.
Also, any time a failure occurs on the server, the sort order of the servers will be updated so that the one with the fewest consecutive failures is chosen for the next query that goes on the wire, this way bad or non-responsive servers are automatically isolated.
Since the server list is now dynamic, the tracking of query failures per server has been removed and instead is relying on the server sort order as previously described. This simplifies the logic while also reducing the amount of memory required per query. However, because of this dynamic nature, it may not be easy to determine the server attempt order for enqueued queries if there have been any failures.
If using the ARES_OPT_ROTATE, this is now implemented to be a random selection of the configured servers. Since the server list is dynamic, its not possible to go to the next server as configuration could have changed between queries or attempts for the same query.
Finally, this PR moved some existing functions into new files to logically separate them.
This should address issues #550 and #440, while also setting the framework to implement #301. #301 needs a little more effort since it configures things other than the servers themselves (domains, search, sortlist, lookups), which need to make sure they can be safely updated.
Fix By: Brad House (@bradh352)
New DNS record parsing code. The old code was basically just some helper macros and functions for parsing an entire DNS message. The caller had to know the RFCs to use the parsers, except for some pre-made exceptions. The new parsing code parses the entire DNS message into an opaque data structure in a memory safe manner with various accessors for reading and manipulating the data.
The existing parser helpers for the various record types were reimplemented as wrappers around the new parser.
The accessors allow easy iteration across the DNS record datastructure, and can be used to easily create dig-like output without needing to know anything about the various record types and formats as dynamic helpers are provided for enumeration of values and data types of those values.
At some point in the future, this new DNS record structure, accessors, and parser will be exposed publicly. This is not done at this point as we don't want to do that until the API is completely stable. Likely a write() function to output the DNS record back into an actual message buffer will be introduced with the stable API as well.
Some subtle bugs in the existing code were uncovered, some which had test cases which turned out to be bogus. Validation with third-party implementations (e.g. BIND9) were performed to validate such cases were indeed bugs.
Adding additional RR parsers such as for TLSA (#470) or SVCB/HTTPS (#566) are trivial now since focus can be put on only parsing the data within the RR, not the entire message. That said, as the new parser is not yet public, it isn't clear the best way to expose any new RRs (probably best to wait for the new parser to be public rather than hacking in another legacy function).
Some additional RRs that are part of DNS RFC1035 or EDNS RFC6891 that didn't have previously implemented parsers are now also implemented (e.g. HINFO, OPT). Any unrecognized RRs are encapsulated into a "RAW_RR" as binary data which can be inserted or extracted, but are otherwise not interpreted in any way.
Fix By: Brad House (@bradh352)
c-ares uses multiple code styles, standardize on one. Talking with @bagder he feels strongly about maintaining an 80 column limit, but feels less strongly about things I feel strongly about (like alignment).
Can re-run the formatter on the codebase via:
```
clang-format -i */*.c */*.h */*/*.c */*/*.h
```
Fix By: Brad House (@bradh352)
PR #568 increased the warning levels and c-ares code emitted a bunch of warnings. This PR fixes those warnings and starts transitioning internal data types into more proper forms (e.g. data lengths should be size_t not int). It does, however, have to manually cast back to what the public API needs due to API and ABI compliance (we aren't looking to break integrations, just clean up internals).
Fix By: Brad House (@bradh352)
c-ares currently uses int for boolean, which can be confusing as there are some functions which return int but use '0' as the success condition. Some internal variable usage is similar. Lets try to identify the boolean use cases and split them out into their own data type of ares_bool_t. Since we're trying to keep C89 compatibility, we can't rely on stdbool.h or the _Bool C99 data type, so we'll define our own.
Also, chose using an enum rather than say unsigned char or int because of the type safety benefits it provides. Compilers should warn if you try to pass, ARES_TRUE on to a ares_status_t enum (or similar) since they are different enums.
Fix By: Brad House (@bradh352)
A regression was introduced in 1.20.0 that would pass SOCK_STREAM on udp
connections due to code refactoring. If a client application validated this
data, it could cause issues as seen in gRPC.
Fixes Issue: #571
Fix By: Brad House (@bradh352)
The list of possible error codes in c-ares was a #define list. This not only doesn't provide for any sort of type safety but it also lacks clarification on what a function may return or what it takes, as an int could be an ares status, a boolean, or possibly even a length in the current code.
We are not changing any public APIs as though the C standard states the underlying size and type of an enum is int, there are compiler attributes to override this as well as compiler flags like -fshort-enums. GCC in particular is known to expand an enum's width based on the data values (e.g., it can emit a 64bit integer enum).
All internal usages should be changed by this PR, but of course, there may be some I missed.
Fix By: Brad House (@bradh352)
The purpose of this PR is to hopefully make the private API of this set of routines less likely to need to be changed in a future release. While this is not a public API, it could become harder in the future to change usage as it becomes more widely used within c-ares.
Fix By: Brad House (@bradh352)
ares (and thus c-ares) was originally licensed under the 1989 MIT license text:
https://fedoraproject.org/wiki/Licensing:MIT#Old_Style_(no_advertising_without_permission)
This change updates the license to the modern MIT license as recognized here:
https://opensource.org/license/mit/
care has been taken to ensure correct attributions remain for the authors contained within the copyright headers, and all authors with attributions in the headers have been contacted for approval regarding the change. Any authors which were not able to be contacted, the original copyright maintains, luckily that exists in only a single file `ares_parse_caa_reply.c` at this time.
Please see PR #556 for the documented approvals by each contributor.
Fix By: Brad House (@bradh352)
As per #266, TCP queries are basically broken. If we get a partial reply, things just don't work, but unlike UDP, TCP may get fragmented and we need to properly handle that.
I've started creating a basic parser/buffer framework for c-ares for memory safety reasons, but it also helps for things like this where we shouldn't be manually tracking positions and fetching only a couple of bytes at a time from a socket. This parser/buffer will be expanded and used more in the future.
This also resolves#206 by allowing NULL to be specified for some socket callbacks so they will auto-route to the built-in c-ares functions.
Fixes: #206, #266
Fix By: Brad House (@bradh352)
As per #541, when using AF_UNSPEC with ares_getaddrinfo() (and in turn with ares_gethostbynam()) if we receive a successful response for one address class, we should not allow the other address class to continue on with retries, just return the address class we have.
This will limit the overall query time to whatever timeout remains for the pending query for the other address class, it will not, however, terminate the other query as it may still prove to be successful (possibly coming in less than a millisecond later) and we'd want that result still. It just turns off additional error processing to get the result back quicker.
Fixes Bug: #541
Fix By: Brad House (@bradh352)
Add a new ARES_OPT_UDP_MAX_QUERIES option with udp_max_queries parameter that can be passed to ares_init_options(). This value defaults to 0 (unlimited) to maintain existing compatibility, any positive number will cause new UDP ephemeral ports to be created once the threshold is reached, we'll call these 'connections' even though its technically wrong for UDP.
Implementation Details:
* Each server entry in a channel now has a linked-list of connections/ports for udp and tcp. The first connection in the list is the one most likely to be eligible to accept new queries.
* Queries are now tracked by connection rather than by server.
* Every time a query is detached from a connection, the connection that it was attached to will be checked to see if it needs to be cleaned up.
* Insertion, lookup, and searching for connections has been implemented as O(1) complexity so the number of connections will not impact performance.
* Remove is_broken from the server, it appears it would be set and immediately unset, so must have been invalidated via a prior patch. A future patch should probably track consecutive server errors and de-prioritize such servers. The code right now will always try servers in the order of configuration, so a bad server in the list will always be tried and may rely on timeout logic to try the next.
* Various other cleanups to remove code duplication and for clarification.
Fixes Bug: #444
Fix By: Brad House (@bradh352)
c-ares currently lacks modern data structures that can make coding easier and more efficient. This PR implements a new linked list, skip list (sorted linked list), and hashtable implementation that are easy to use and hard to misuse. Though these implementations use more memory allocations than the prior implementation, the ability to more rapidly iterate on the codebase is a bigger win than any marginal performance difference (which is unlikely to be visible, modern systems are much more powerful than when c-ares was initially created).
The data structure implementation favors readability and audit-ability over performance, however using the algorithmically correct data type for the purpose should offset any perceived losses.
The primary motivation for this PR is to facilitate future implementation for Issues #444, #135, #458, and possibly #301
A couple additional notes:
The ares_timeout() function is now O(1) complexity instead of O(n) due to the use of a skiplist.
Some obscure bugs were uncovered which were actually being incorrectly validated in the test cases. These have been addressed in this PR but are not explicitly discussed.
Fixed some dead code warnings in ares_rand for systems that don't need rc4
Fix By: Brad House (@bradh352)
All files have their licence and copyright information clearly
identifiable. If not in the file header, they are set separately in
.reuse/dep5.
All used license texts are provided in LICENSES/
To prevent possible users having XSS issues due to intentionally malformed DNS replies, validate hostnames returned in responses and return EBADRESP if they are not valid.
It is not clear what legitimate issues this may cause at this point.
Bug Reported By: philipp.jeitner@sit.fraunhofer.de
Fix By: Brad House (@bradh352)
There is too much inconsistency between platforms for arpa/nameser.h and arpa/nameser_compat.h for the way the current files are structured. Still load the respective system files but make our private nameser.h more forgiving.
Fixes: #388
Fix By: Brad House (@bradh352)