As per #738, there are use cases where the DNS TXT record strings should
not be concatenated as RFC 7208 indicates. We cannot break ABI for
existing users of the new API, so we need to support retrieving the
concatenated version as well as a new API to retrieve the individual
strings, which will be used by `ares_parse_txt_reply_ext()` to restore
the behavior from before c-ares 1.20.
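A minimal sketch of iterating the individual strings of a parsed TXT
RR; the accessor names `ares_dns_rr_get_abin_cnt()` /
`ares_dns_rr_get_abin()` are assumptions here, not taken from this
description:

```c
#include <stdio.h>
#include <ares.h>

/* Sketch: walk each TXT RR in a parsed reply and print each string
 * individually (accessor names are assumptions, see above). */
static void print_txt_strings(const ares_dns_record_t *dnsrec)
{
  size_t i;

  for (i = 0; i < ares_dns_record_rr_cnt(dnsrec, ARES_SECTION_ANSWER); i++) {
    const ares_dns_rr_t *rr =
      ares_dns_record_rr_get_const(dnsrec, ARES_SECTION_ANSWER, i);
    size_t j;

    if (ares_dns_rr_get_type(rr) != ARES_REC_TYPE_TXT)
      continue;

    /* One entry per string as it appeared on the wire */
    for (j = 0; j < ares_dns_rr_get_abin_cnt(rr, ARES_RR_TXT_DATA); j++) {
      size_t               len  = 0;
      const unsigned char *data =
        ares_dns_rr_get_abin(rr, ARES_RR_TXT_DATA, j, &len);

      printf("txt[%zu]: %.*s\n", j, (int)len, (const char *)data);
    }
  }
}
```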
Fixes Issue: #738
Fix By: Brad House (@bradh352)
This PR enables DNS 0x20 as per
https://datatracker.ietf.org/doc/html/draft-vixie-dnsext-dns0x20-00 .
DNS 0x20 adds additional entropy to the request by randomly altering the
case of the DNS question to help prevent cache poisoning attacks.
Google DNS has implemented this support as of 2023, even though this is
a proposed and expired standard from 2008:
https://groups.google.com/g/public-dns-discuss/c/KxIDPOydA5M
There have been documented cases of name server and caching server
non-conformance, though such cases are expected to become rarer,
especially since Google has started using this.
This can be enabled via the `ARES_FLAG_DNS0x20` flag, which is
currently disabled by default. The test cases do, however, enable this
flag to validate the feature.
Implementors using this flag will notice that responses will retain the
mixed case, but since DNS names are case-insensitive, any proper
implementation should not be impacted.
There is currently no fallback mechanism implemented, as it isn't
immediately clear how this may affect a stub resolver like c-ares,
where we aren't querying the authoritative name server but instead an
intermediate recursive resolver, and some domains may return invalid
results while others return valid results, all from the same
nameserver. DNS cookies, as suggested in #620, are likely a better
mechanism to fight cache poisoning attacks for stub resolvers.
TCP queries do not use this feature even if the `ARES_FLAG_DNS0x20` flag
is specified since they are not subject to cache poisoning attacks.
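A minimal sketch of enabling it at channel initialization (standard
options pattern; error handling omitted):

```c
#include <string.h>
#include <ares.h>

/* Sketch: opt in to DNS 0x20 query-name case randomization. */
static int init_channel_dns0x20(ares_channel_t **channel)
{
  struct ares_options opts;

  memset(&opts, 0, sizeof(opts));
  opts.flags = ARES_FLAG_DNS0x20; /* off by default */

  return ares_init_options(channel, &opts, ARES_OPT_FLAGS);
}
```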
Fixes Issue: #795
Fix By: Brad House (@bradh352)
With very little effort we should be able to determine reasonably
proper timeouts based on prior query history. We track history in
order to be able to auto-scale when network conditions change (e.g.
maybe there is a provider failover and timings change due to that).
Apple appears to do this within their system resolver in macOS.
Obviously we should have a minimum, maximum, and initial value to make
sure the algorithm doesn't somehow go off the rails.
Values:
- Minimum Timeout: 250ms (approximate RTT half-way around the globe)
- Maximum Timeout: 5000ms (the timeout recommended by RFC 1123); this
can be reduced via ARES_OPT_MAXTIMEOUTMS, in which case the bound
specified by the option caps the retry timeout.
- Initial Timeout: User-specified via configuration or
ARES_OPT_TIMEOUTMS
- Average latency multiplier: 5x (a local DNS server returning a cached
value will be quicker than if it needs to recurse, so we need to
account for this)
- Minimum Count for Average: 3. This is the minimum number of queries we
need to form an average for the bucket.
Per-server buckets track latency over time (these are ephemeral,
meaning they don't persist once a channel is destroyed). We record both
the current timespan for the bucket and the immediately preceding
timespan so that, across roll-overs, we can still maintain recent
metrics for calculations:
- 1 minute
- 15 minutes
- 1 hour
- 1 day
- since inception
Each bucket contains:
- timestamp (divided by interval)
- minimum latency
- maximum latency
- total time
- count
NOTE: average latency is (total time / count); we calculate this
dynamically when needed
Basic algorithm for calculating the timeout to use (a C sketch follows
the notes below):
- Scan from the most recent bucket to the least recent
- Check the timestamp of the bucket; if it doesn't match the current
time, continue to the next bucket
- Check the count of the bucket; if it's not at least the "Minimum
Count for Average", check the bucket's preceding timespan, and if that
also falls short, continue to the next bucket
- If we reach the end with no bucket match, use the "Initial Timeout"
- If a bucket is selected, take ("total time" / count) as the average
latency, multiply by the "Average Latency Multiplier", and bound the
result by the "Minimum Timeout" and "Maximum Timeout"
NOTE: The calculated timeout may not be the timeout actually used. If
we are retrying the query on the same server, a larger value will be
used.
On each query reply where the response is legitimate (a proper
response or NXDOMAIN) and not something like a server error:
- Cycle through each bucket in order
- Check the timestamp of the bucket against the current timestamp; if
out of date, overwrite the previous-timespan entry with the current
values and clear the current values
- Compare the current minimum and maximum recorded latencies against
the query time and adjust if necessary
- Increment "count" by 1 and "total time" by the query time
Other Notes:
- This is always-on; the only user-configurable value is the initial
timeout, which simply re-uses the current option.
- Minimum and Maximum latencies for a bucket are currently unused but
are there in case we find a need for them in the future.
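An illustrative C sketch of the bucket selection and timeout
calculation described above; all names, types, and layout here are
hypothetical rather than c-ares internals:

```c
#include <stddef.h>

/* Hypothetical bucket layout; not the actual c-ares data structures. */
typedef struct {
  unsigned long ts;            /* timestamp divided by bucket interval */
  unsigned long total_ms;      /* total time for current timespan */
  size_t        count;         /* sample count for current timespan */
  unsigned long prev_total_ms; /* preceding timespan, kept across roll-over */
  size_t        prev_count;
} latency_bucket_t;

#define MIN_TIMEOUT_MS 250  /* approximate RTT half-way around the globe */
#define MAX_TIMEOUT_MS 5000 /* RFC 1123 recommended timeout */
#define AVG_MULTIPLIER 5    /* cached vs. recursed latency headroom */
#define MIN_AVG_COUNT  3    /* minimum samples to form an average */

/* now_ts holds the current timestamp divided by each bucket's interval. */
static unsigned long calc_timeout_ms(const latency_bucket_t *buckets,
                                     const unsigned long    *now_ts,
                                     size_t                  nbuckets,
                                     unsigned long           initial_ms)
{
  size_t i;

  /* Scan from most recent bucket (1 minute) to least recent (inception) */
  for (i = 0; i < nbuckets; i++) {
    unsigned long total;
    size_t        count;

    if (buckets[i].ts != now_ts[i]) {
      continue; /* bucket is stale */
    }

    if (buckets[i].count >= MIN_AVG_COUNT) {
      total = buckets[i].total_ms;
      count = buckets[i].count;
    } else if (buckets[i].prev_count >= MIN_AVG_COUNT) {
      total = buckets[i].prev_total_ms; /* fall back to preceding timespan */
      count = buckets[i].prev_count;
    } else {
      continue; /* not enough samples, try the next bucket */
    }

    unsigned long timeout = (total / count) * AVG_MULTIPLIER;
    if (timeout < MIN_TIMEOUT_MS) {
      timeout = MIN_TIMEOUT_MS;
    }
    if (timeout > MAX_TIMEOUT_MS) {
      timeout = MAX_TIMEOUT_MS;
    }
    return timeout;
  }

  return initial_ms; /* no bucket had enough history */
}
```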
Fixes Issue: #736
Fix By: Brad House (@bradh352)
The query cache should be enabled by default. This will help with
determining proper timeouts for #736. It can still be disabled by
setting the TTL to 0. There should be no negative consequences of this
in real-world scenarios, since DNS is based on the TTL concept and
upstream servers will cache results and not recurse based on this
information anyhow. DNS queries and responses are very small, so this
should have negligible impact on memory consumption.
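A minimal sketch of opting out, assuming the cache is controlled via
the `ARES_OPT_QUERY_CACHE` option and its `qcache_max_ttl` field:

```c
#include <string.h>
#include <ares.h>

/* Sketch: disable the (default-on) query cache by setting its
 * maximum TTL to 0. */
static int init_channel_no_cache(ares_channel_t **channel)
{
  struct ares_options opts;

  memset(&opts, 0, sizeof(opts));
  opts.qcache_max_ttl = 0; /* a TTL of 0 disables caching */

  return ares_init_options(channel, &opts, ARES_OPT_QUERY_CACHE);
}
```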
Fix By: Brad House (@bradh352)
With the current c-ares parser, as per PR #765, parsing was broken due
to validation that didn't understand the `SIG` record class. This PR
adds basic, non-validating, and incomplete support for the `SIG` record
type. The additional `KEY` and `NXT` records, which would be required
for actual verification of the records, are not implemented. It also
does not store the raw unprocessed RR data that would be required for
the validation. The primary purpose of this PR is to be able to
recognize the record and handle some peripheral aspects, such as
validating the class associated with the RR and not honoring the TTL
in the RR in the c-ares query cache, since it will always be 0.
Fixes Issue: #765
Fix By: Brad House (@bradh352)
Automatically detect configuration changes and reload. On systems that
provide notification mechanisms, use those; otherwise fall back to
polling. When a system configuration change is detected, the
configuration is applied asynchronously to ensure it is a non-blocking
operation for any queries which may still be being processed.
On Windows, however, changes aren't detected if a user manually
sets/changes the DNS servers on an interface; there doesn't appear to
be any mechanism capable of detecting this. We rely on
`NotifyIpInterfaceChange()` for notifications.
Fixes Issue: #613
Fix By: Brad House (@bradh352)
At https://github.com/c-ares/c-ares/pull/601#issuecomment-1801935063 you
chose not to scatter `const` on the public interface because of the plan
- now realised - to add threading to c-ares, and in the expectation that
even read operations would need to lock the mutex.
But the threading implementation has a _pointer_ to a mutex inside the
ares channel, and as I understand it, that means it is just fine to
mark `ares__channel_lock` (and `ares__channel_unlock`) as taking a
`const` channel. It is the pointed-to mutex that is not constant, and C
does not propagate `const`-ness through pointers.
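A minimal illustration of the point, using simplified stand-in types
rather than the actual c-ares definitions:

```c
#include <pthread.h>

/* Simplified stand-in for the channel; not the real c-ares struct. */
typedef struct {
  pthread_mutex_t *lock; /* pointer to a mutex, not an embedded mutex */
} channel_t;

/* `const` freezes the pointer member itself, not the mutex it points
 * to, so a const channel can still be locked. */
static void channel_lock(const channel_t *ch)
{
  pthread_mutex_lock(ch->lock); /* fine: *ch->lock remains mutable */
}
```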
This PR sprinkles const where appropriate on public interfaces.
Fix By: David Hotham (@dimbleby)
**Summary**
This PR adds a server state callback that is invoked whenever a query to
a DNS server finishes.
The callback is invoked with the server details (as a string), a boolean
indicating whether the query succeeded or failed, flags describing the
query (currently just indicating whether TCP or UDP was used), and
custom userdata.
This can be used by user applications to gain observability into DNS
server health and usage: for example, alerts when a DNS server
fails/recovers, or metrics to track how often a DNS server is used and
responds successfully.
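A minimal sketch of registering such a callback; the function and flag
names (`ares_set_server_state_callback()`, `ARES_SERV_STATE_TCP`) are
assumptions, not taken from this description:

```c
#include <stdio.h>
#include <ares.h>

/* Sketch: log every per-server query outcome (names assumed above). */
static void server_state_cb(const char *server_string, ares_bool_t success,
                            int flags, void *data)
{
  (void)data;
  fprintf(stderr, "server %s %s (%s)\n", server_string,
          success == ARES_TRUE ? "succeeded" : "failed",
          (flags & ARES_SERV_STATE_TCP) ? "TCP" : "UDP");
}

/* After channel initialization:
 *   ares_set_server_state_callback(channel, server_state_cb, NULL); */
```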
**Testing**
Three new regression tests `MockChannelTest.ServStateCallback*` have
been added to test the new callback in different success/failure
scenarios.
Fix By: Oliver Welsh (@oliverwelsh)
**Summary**
By default c-ares will select the server with the least number of
consecutive failures when sending a query. However, this means that if a
server temporarily goes down and hits failures (e.g. a transient network
issue), then that server will never be retried until all other servers
hit the same number of failures.
This is an issue if the failed server is preferred over other servers
in the list, for example if a primary server and a backup server are
configured.
This PR adds new server failover retry behavior, where failed servers
are retried with small probability after a minimum delay has passed. The
probability and minimum delay are configurable via the
`ARES_OPT_SERVER_FAILOVER` option. By default c-ares will use a
probability of 10% and a minimum delay of 5 seconds.
In addition, this PR includes a small change to always close out
connections to servers which have hit failures, even with
`ARES_FLAG_STAYOPEN`. It's possible that resetting the connection can
resolve some server issues (e.g. by resetting the source port).
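A minimal sketch of tuning the behavior, assuming the option struct
carries a 1-in-N retry chance and a delay in milliseconds:

```c
#include <string.h>
#include <ares.h>

/* Sketch: retry a failed server with 10% probability after at least
 * 5 seconds (the documented defaults). */
static int init_channel_failover(ares_channel_t **channel)
{
  struct ares_options opts;

  memset(&opts, 0, sizeof(opts));
  opts.server_failover_opts.retry_chance = 10;   /* 1 in 10 => 10% */
  opts.server_failover_opts.retry_delay  = 5000; /* milliseconds */

  return ares_init_options(channel, &opts, ARES_OPT_SERVER_FAILOVER);
}
```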
**Testing**
A new set of regression tests have been added to test the new server
failover retry behavior.
Fixes Issue: #717
Fix By: Oliver Welsh (@oliverwelsh)
As per Issue #734, some people use `ndots:0` in their configuration,
which is allowed by the system resolver but not by c-ares. Add support
for `ndots:0` and a test case to validate the behavior.
Fixes Issue: #734
Fix By: Brad House (@bradh352)
c-ares has historically passed around raw DNS packets in binary form.
Now that we have a new parser, and messages are already parsed
internally, let's pass around that parsed message rather than requiring
multiple parse attempts on the same message. Also add new
`ares_send_dnsrec()` and `ares_query_dnsrec()` functions, similar to
the `ares_search_dnsrec()` added in PR #719, that can return a pointer
to the `ares_dns_record_t` to the caller enqueuing queries, and rework
`ares_search_dnsrec()` to use `ares_send_dnsrec()` internally.
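A minimal sketch of consuming the parsed message directly; the exact
`ares_query_dnsrec()` and `ares_callback_dnsrec` signatures shown are
assumptions:

```c
#include <stdio.h>
#include <ares.h>

/* Sketch: receive the already-parsed DNS record in the callback
 * instead of a raw buffer. */
static void query_done(void *arg, ares_status_t status, size_t timeouts,
                       const ares_dns_record_t *dnsrec)
{
  (void)arg;
  (void)timeouts;

  if (status != ARES_SUCCESS || dnsrec == NULL) {
    return;
  }
  printf("answer RRs: %u\n",
         (unsigned int)ares_dns_record_rr_cnt(dnsrec, ARES_SECTION_ANSWER));
}

/* Usage (the qid output parameter may be NULL):
 *   ares_query_dnsrec(channel, "example.com", ARES_CLASS_IN,
 *                     ARES_REC_TYPE_A, query_done, NULL, NULL); */
```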
Fix By: Brad House (@bradh352)
This PR adds a new function `ares_search_dnsrec()` to search for records
using the new DNS record parser.
The function takes an arbitrary DNS record object to search (which
must represent a query for a single name), along with a new callback
type, `ares_callback_dnsrec`, that is invoked with a parsed DNS record
object rather than the raw buffer(+length).
The original motivation for this change is to provide support for
[draft-kaplan-enum-sip-routing-04](https://datatracker.ietf.org/doc/html/draft-kaplan-enum-sip-routing-04);
when routing phone calls using an ENUM server, it can be useful to
include identifying source information in an OPT RR options value, to
help select the appropriate route for the call. The new function allows
for more customisable searches like this.
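A minimal sketch of building a single-question record and handing it to
the new function; the record-construction helpers shown are assumptions
based on the new DNS record API:

```c
#include <ares.h>

/* Sketch: search for an A record via a caller-built DNS record. */
static ares_status_t search_a_record(ares_channel_t      *channel,
                                     ares_callback_dnsrec cb, void *arg)
{
  ares_dns_record_t *dnsrec = NULL;
  ares_status_t      status;

  status = ares_dns_record_create(&dnsrec, 0, ARES_FLAG_RD,
                                  ARES_OPCODE_QUERY, ARES_RCODE_NOERROR);
  if (status != ARES_SUCCESS) {
    return status;
  }

  /* Must represent a query for a single name */
  status = ares_dns_record_query_add(dnsrec, "example.com",
                                     ARES_REC_TYPE_A, ARES_CLASS_IN);
  if (status == ARES_SUCCESS) {
    status = ares_search_dnsrec(channel, dnsrec, cb, arg);
  }

  ares_dns_record_destroy(dnsrec); /* caller retains ownership */
  return status;
}
```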
**Summary of code changes**
A new function `ares_search_dnsrec()` has been added and exposed.
Moreover, the entire `ares_search_int()` internal code flow has been
refactored to use parsed DNS record objects and the new DNS record
parser. The DNS record object is passed through the `search_query`
structure by encoding/decoding to/from a buffer (if multiple search
domains are used). A helper function `ares_dns_write_query_altname()` is
used to re-write the DNS record object with a new query name (used to
append search domains).
`ares_search()` is now a wrapper around the new internal code, where the
DNS record object is created based on the name, class and type
parameters.
The new function uses a new callback type, `ares_callback_dnsrec`. This
is invoked with a parsed DNS record object. For now, we convert from
`ares_callback` to this new type using `ares__dnsrec_convert_cb()`.
Some functions that are common to both `ares_query()` and
`ares_search()` have been refactored using the new DNS record parser.
See `ares_dns_record_create_query()` and
`ares_dns_query_reply_tostatus()`.
**Testing**
A new FV has been added to test the new function, which searches for a
DNS record containing an OPT RR with a custom options value.
As part of this, I needed to enhance the mock DNS server to expect
request text (and assert that it matches actual request text). This is
because the FV needs to check that the request contains the correct OPT
RR.
**Documentation**
The man page docs have been updated to describe the new feature.
**Futures**
In the future, a new variant of `ares_send()` could be introduced in the
same vein (`ares_send_dnsrec()`). This could be used by
`ares_search_dnsrec()`. Moreover, we could migrate internal code to use
`ares_callback_dnsrec` as the default callback.
This will help to make the new DNS record parser the norm in c-ares.
---------
Co-authored-by: Oliver Welsh (@oliverwelsh)
Since acountry cannot be restored due to nerd.dk being decommissioned,
we should completely remove the manpage and source. This also resolves
issue #718.
Fixes Issue: #718
Fix By: Brad House (@bradh352)
Hello, I work on an application for Microsoft which uses c-ares to
perform DNS lookups. We have made some minor changes to the library over
time, and would like to contribute these back to the project in case
they are useful more widely. This PR adds a new channel init flag,
described below.
Please let me know if I can include any more information to make this PR
better/easier for you to review. Thanks!
**Summary**
When initializing a channel with `ares_init_options()`, if there are no
nameservers available (because `ARES_OPT_SERVERS` is not used and
`/etc/resolv.conf` is either empty or not available) then a default
local named server will be added to the channel.
However, in some applications a local named server will never be
available. In this case, all subsequent queries on the channel will
fail.
If we know this ahead of time, then it may be preferred to fail channel
initialization directly rather than wait for the queries to fail. This
gives better visibility, since we know that the failure is due to
missing servers rather than something going wrong with the queries.
This PR adds a new flag `ARES_FLAG_NO_DFLT_SVR`, to indicate that a
default local named server should not be added to a channel in this
scenario. Instead, a new error `ARES_EINITNOSERVER` is returned and
initialization fails.
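A minimal sketch of using the flag and handling the new error:

```c
#include <stdio.h>
#include <string.h>
#include <ares.h>

/* Sketch: fail initialization outright when no nameservers are
 * configured, instead of falling back to a default local server. */
static int init_channel_strict(ares_channel_t **channel)
{
  struct ares_options opts;
  int                 rc;

  memset(&opts, 0, sizeof(opts));
  opts.flags = ARES_FLAG_NO_DFLT_SVR;

  rc = ares_init_options(channel, &opts, ARES_OPT_FLAGS);
  if (rc == ARES_EINITNOSERVER) {
    fprintf(stderr, "no DNS servers are available\n");
  }
  return rc;
}
```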
**Testing**
I have added 2 new FV tests:
- `ContainerNoDfltSvrEmptyInit` to test that initialization fails when
no nameservers are available and the flag is set.
- `ContainerNoDfltSvrFullInit` to test that initialization still
succeeds when the flag is set but other nameservers are available.
Existing FVs are all passing.
**Documentation**
I have had a go at manually updating the docs to describe the new
flag/error, but couldn't see any contributing guidance about testing
this. Please let me know if you'd like anything more here.
---------
Fix By: Oliver Welsh (@oliverwelsh)
Add a function to request the number of active queries from an ares
channel. This returns the number of in-flight requests to DNS servers.
Some functions, like `ares_getaddrinfo()` when using `AF_UNSPEC`, may
enqueue multiple queries, which will be reflected in this count.
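A minimal usage sketch, assuming the function is
`ares_queue_active_queries()` and returns a count:

```c
#include <stdio.h>
#include <ares.h>

/* Sketch: report how many requests are currently in flight. */
static void report_active_queries(const ares_channel_t *channel)
{
  printf("in-flight queries: %u\n",
         (unsigned int)ares_queue_active_queries(channel));
}
```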
In the future, if we implement support for queuing (e.g. for throttling
purposes), and/or implement support for tracking user-requested queries
(e.g. for cancelation), we can provide additional functions for
inspecting those queues.
Fix By: Brad House (@bradh352)
It may be useful to wait for the queue to be empty under certain
conditions (mainly test cases). Expose a function to do this
efficiently, and rework the test cases to use it.
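A minimal usage sketch; the `ares_queue_wait_empty()` name and its
millisecond-timeout parameter are assumptions:

```c
#include <ares.h>

/* Sketch: block until all queued queries complete, giving up after
 * five seconds. */
static ares_bool_t drain_queue(ares_channel_t *channel)
{
  return ares_queue_wait_empty(channel, 5000) == ARES_SUCCESS
           ? ARES_TRUE
           : ARES_FALSE;
}
```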
Fix By: Brad House (@bradh352)
This PR implements an event thread to process all events on file descriptors registered by c-ares. Prior to this feature, integrators were required to understand the internals of c-ares: how to monitor file descriptors, handle timeouts, and process events.
It implements efficient OS-specific polling such as epoll(), kqueue(), or IOCP, and falls back to poll() or select() where those are unsupported. At this point, it depends on basic threading primitives such as pthreads or Windows threads.
If enabled via the ARES_OPT_EVENT_THREAD option passed to ares_init_options(), then socket callbacks cannot be used.
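A minimal sketch of enabling it, assuming the `evsys` options field and
the `ARES_EVSYS_DEFAULT` backend selector:

```c
#include <string.h>
#include <ares.h>

/* Sketch: let c-ares run its own event thread so the application
 * never has to poll file descriptors or compute timeouts itself. */
static int init_channel_event_thread(ares_channel_t **channel)
{
  struct ares_options opts;

  memset(&opts, 0, sizeof(opts));
  opts.evsys = ARES_EVSYS_DEFAULT; /* pick the best available backend */

  return ares_init_options(channel, &opts, ARES_OPT_EVENT_THREAD);
}
```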
Fixes Bug: #611
Fix By: Brad House (@bradh352)
This pull request adds six flags to instruct the parser, under various circumstances, to skip parsing of the returned RR records so the raw data can be retrieved.
Fixes Bug: #686
Fix By: Erik Lax (@eriklax)
ahost wasn't printing both IPv4 and IPv6 addresses. In this day and age, it really should.
This PR also adds the ability to specify the servers to use.
Fix By: Brad House (@bradh352)
Some environments may send router advertisements on a link setting their link-local (fe80::/10) address as a valid DNS server for the remote system. This causes a DNS entry like `fe80::1%iface` to be created; since all link-local network interfaces are technically part of the same /10 subnet, the system must be told explicitly which interface to send packets through if there are multiple physical interfaces.
This PR adds support for the %iface modifier when setting DNS servers via `/etc/resolv.conf` as well as via `ares_set_servers_csv()`.
For macOS and iOS it is assumed that libresolv will set the `sin6_scope_id` and should be supported, but my test systems don't seem to read the Router Advertisement for the RDNSS link-local option. Specifying the link-local DNS server on macOS via adig has been tested and confirmed working.
For Windows, the situation is similar to macOS in that the system doesn't seem to honor the RDNSS RA, but specifying the server manually has been tested to work.
At this point, Android support does not exist.
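A minimal sketch of setting such a server programmatically ("eth0" is
just an example interface name):

```c
#include <ares.h>

/* Sketch: configure a link-local DNS server scoped to an interface.
 * The %iface suffix selects the egress interface for fe80::/10. */
static int use_linklocal_server(ares_channel_t *channel)
{
  return ares_set_servers_csv(channel, "fe80::1%eth0");
}
```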
Fixes Bug: #462
Supersedes PR #463
Fix By: Brad House (@bradh352) and Serhii Purik (@sergvpurik)
c-ares does not have any concept of thread-safety. It has always been 100% up to the implementor to ensure they never call c-ares from more than one thread at a time. This patch adds basic thread-safety support, which can be disabled at compile time if not desired. It uses a single recursive mutex per channel, which should be extremely quick when uncontested so overhead should be minimal.
Fixes Bug: #610
Also sets the stage to implement #611
Fix By: Brad House (@bradh352)
For historic reasons, we have users depending on ares_set_servers_*()
to return ARES_SUCCESS when passed no servers, and to actually *clear*
the server list. It appears they do this in test cases to simulate DNS
being unavailable or similar. Presumably they could achieve the same
effect in other ways (point to localhost on a port that isn't in use),
but this usage may be widespread enough to cause headaches, so we will
simply document and test for this behavior; clearly it hasn't caused
"issues" for anyone with the old behavior.
See: https://github.com/nodejs/node/pull/50800
Fix By: Brad House (@bradh352)
This PR implements a query cache at the lowest possible level, the actual dns request and response messages. Only successful and `NXDOMAIN` responses are cached. The lowest TTL in the response message determines the cache validity period for the response, and is capped at the configuration value for `qcache_max_ttl`. For `NXDOMAIN` responses, the SOA record is evaluated.
For a query to match the cache, the opcode, flags, and each question's class, type, and name are all evaluated. This is to prevent matching a cached entry for a subtly different query (such as if the RD flag is set on one request and not another).
For things like ares_getaddrinfo() or ares_search() that may spawn multiple queries, each individual message received is cached rather than the overarching response. This makes it possible for one query in the sequence to be purged from the cache while others still return cached results, which means there is no chance of ever returning stale data.
We have had a lot of user requests to return TTLs from the various parsers like `ares_parse_caa_reply()`, likely because users want to implement caching mechanisms of their own; this PR should solve those issues as well.
Due to the internal data structures we have these days, this PR is less than 500 lines of new code.
Fixes Issue: #608
Fix By: Brad House (@bradh352)
The retry timeout values were using a fixed calculation, which could cause multiple simultaneous queries to time out and retry at the exact same time. If a DNS server is throttling requests, this could cause the issue to never self-resolve, since all requests would recur at the same instant again.
This PR also adds a maximum timeout option to ensure the randomly selected value never exceeds a configured bound.
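A minimal sketch of setting the cap, assuming the `maxtimeout` options
field and the `ARES_OPT_MAXTIMEOUTMS` option:

```c
#include <string.h>
#include <ares.h>

/* Sketch: bound the randomized retry timeout at 3 seconds. */
static int init_channel_capped_timeout(ares_channel_t **channel)
{
  struct ares_options opts;

  memset(&opts, 0, sizeof(opts));
  opts.maxtimeout = 3000; /* milliseconds */

  return ares_init_options(channel, &opts, ARES_OPT_MAXTIMEOUTMS);
}
```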
Fix By: Ignat (@Kontakter)