The autotools rewrite in c-ares 1.25 might have broken detection
of __system_property_get() which is needed on Android versions
prior to v8 in order to read DNS server settings.
This might fix #907
Authored-By: Brad House (@bradh352)
TSAN is warning about a thread concurrency issue that doesn't actually
matter even if the operation isn't atomic, as it's only an optimization
in this code path to skip timeout processing if we're shutting down due
to ares_destroy().
Fix By: Jiwoo Park (@jimmy-park)
We added an optimization to stop retries on other address classes
on failures if one address class was received successfully. In
production, however, some odd misconfigured use cases could mean
an ipv6 address would be returned but the host was really only
capable of connecting to ipv4 machines.
We want to modify this optimization now to continue retries on
ipv4 even if ipv6 was received, but NOT the other way around.
It was always more likely that ipv6 resolution would cause the
delays due to system issues, as the world still really only
runs on ipv4...
Authored-By: Brad House (@bradh352)
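The asymmetric retry rule described above can be sketched as a small predicate. This is only an illustration: `keep_retrying()` and its parameters are hypothetical names, not part of the c-ares API, and the real logic likely considers more state than two address families.

```c
#include <stdbool.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Hypothetical helper (not a c-ares function): decide whether a failed
 * lookup for `failed_family` should still be retried when a lookup for
 * `succeeded_family` has already returned an answer. */
static bool keep_retrying(int failed_family, int succeeded_family)
{
  /* IPv6 answered but IPv4 failed: keep retrying IPv4, since the host may
   * only be capable of connecting to IPv4 machines. */
  if (failed_family == AF_INET && succeeded_family == AF_INET6) {
    return true;
  }

  /* Any other combination (notably IPv4 answered, IPv6 failed): stop
   * retrying, which preserves the original optimization. */
  return false;
}
```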
Due to the way record duplication works, we might sometimes get a
misleading error code. Rewrite the error code to make better
sense.
Authored-By: Brad House (@bradh352)
As reported on Android 14, which recently added a fortify check for fcntl
F_SETFD to make sure only FD_CLOEXEC is ever passed:
dfe67d266c
We can't pass anything else to it. Even though on glibc, O_CLOEXEC
and FD_CLOEXEC are the same value, this isn't defined to be
portable in any way.
Fixes #900
Reported-By: Yuki San (@RealYukiSan)
Authored-By: Brad House (@bradh352)
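A minimal sketch of the portable form, assuming a POSIX system: `F_SETFD` is given `FD_CLOEXEC` and nothing else. The helper name `set_cloexec()` is hypothetical and is not the function changed by this commit.

```c
#include <fcntl.h>

/* Hypothetical helper: mark a descriptor close-on-exec.
 * Returns 0 on success, -1 on error (errno is set by fcntl()). */
static int set_cloexec(int fd)
{
  /* FD_CLOEXEC is the only flag defined for F_SETFD. O_CLOEXEC is an
   * open()-time flag and must not be passed here, even where the two
   * happen to share a value. */
  if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
    return -1;
  }
  return 0;
}
```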
As per Issue #883, an incorrect `libcares.pc` can be generated when
`CMAKE_THREAD_LIBS_INIT` contains a value like `-lpthread` because it
gets added to `libcares.pc` with another `-l` prefix. We can't control
the behavior of `CMAKE_THREAD_LIBS_INIT` since it's set by
`FIND_PACKAGE (Threads)`.
Let's strip the `-l` prefix from the library before adding it to
`CARES_DEPENDENT_LIBS`, which is used in the generation of `libcares.pc`.
Fixes #883
Reported-By: 前进,前进,进 (@leleliu008)
Fix By: Brad House (@bradh352)
Due to running containerized tests we had to cut and paste the `TEST_P`
macro definition in `googletest/include/gtest/gtest-param-test.h`, and
modify it as it isn't designed to be wrapped in any way. Unfortunately
this tends to change from release to release, usually in minor ways ...
but also google test specifically doesn't advertise its own version so
it can be hard to work around.
Let's try to fix compatibility with google test 1.15.
Fixes #873
Authored-By: Brad House (@bradh352)
As per #852, searching is failing, partly because the ndots value does
not default to a proper value on Linux, and partly because
systemd-resolved returns the wrong error codes.
This PR fixes the first issue and adds containerized test cases to
validate the behavior and prevent issues in the future.
Reported-By: Hans-Christian Egtvedt (@egtvedt) and Mikael Lindemann (@mikaellindemann)
Authored-By: Brad House (@bradh352)
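For context on why the ndots default matters, here is a rough sketch of how the threshold is conventionally applied. It assumes the default of 1 documented in resolv.conf(5). The helper names are hypothetical and this is not the c-ares search implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_NDOTS 1 /* default documented in resolv.conf(5) */

/* Hypothetical helper: count the dots in a name. */
static size_t count_dots(const char *name)
{
  size_t n = 0;

  for (; *name != '\0'; name++) {
    if (*name == '.') {
      n++;
    }
  }
  return n;
}

/* Hypothetical helper: true when the name has at least `ndots` dots and
 * should therefore be tried as-is before any search domains are appended. */
static bool try_as_is_first(const char *name, size_t ndots)
{
  return count_dots(name) >= ndots;
}
```

With an ndots of 0 instead of the default, every name, including a bare `host`, would be tried as-is first. That shows how a wrong default changes search behavior.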
If a blank DNS name is used, the DNS query cache would fail due to an
invalid sanity check. This can be legitimate such as:
adig -t SOA .
This fixes that situation as well as a few other spots that were
uncovered and adds a test case to validate the behavior to ensure
it won't regress in the future.
Fixes #858
Reported-By: Nodar Chkuaselidze (@nodech)
Authored-By: Brad House (@bradh352)
When a native thread is attached to the JVM, its name is taken from
JavaVMAttachArgs. When no JavaVMAttachArgs or no JavaVMAttachArgs::name
is passed, the JVM decides on its own how to name the thread. Those names
are not descriptive. To preserve the thread name, pass the currently set
thread name in JavaVMAttachArgs::name. pthread_getname_np was introduced
in API 26, hence use a more generic approach.
Fixes #837
Authored-By: Yauheni Khnykin (@Hsilgos)
When using EventThreads, the config change cleanup code might manipulate
the event update list if it uses file descriptors (such as on Linux).
This was being done without a lock. Rework the event enqueuing to handle
locking internally to prevent this and to simplify where it is used.
This was found by chance during an ASAN CI run.
Fix By: Brad House (@bradh352)
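The idea of handling locking inside the enqueue routine, so callers cannot forget it, might look roughly like the sketch below. It assumes POSIX threads. `update_list_t`, `update_enqueue()`, and the fixed-size array are hypothetical stand-ins, not the c-ares event update structures.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_UPDATES 16

/* Hypothetical update list guarded by its own mutex. */
typedef struct {
  pthread_mutex_t lock;
  int             fds[MAX_UPDATES];
  size_t          len;
} update_list_t;

/* Enqueue takes and releases the lock itself, so every caller, including
 * cleanup paths, is serialized without extra locking at the call site.
 * Returns 0 on success, -1 if the list is full. */
static int update_enqueue(update_list_t *list, int fd)
{
  int rv = -1;

  pthread_mutex_lock(&list->lock);
  if (list->len < MAX_UPDATES) {
    list->fds[list->len++] = fd;
    rv = 0;
  }
  pthread_mutex_unlock(&list->lock);

  return rv;
}
```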
We've been using a lot of time on Cirrus-CI and our credits run out
quickly. macOS costs 15 compute credits vs 3 compute
credits for Linux. Move macOS testing to GitHub Actions.
Fix By: Brad House (@bradh352)
Make ahost a dependency of adig to prevent issues with them both
referencing ares_strcasecmp.c and ares_getopt.c. This appears to be
a bug in the CMake generator for MSVC project files.
Fixes #796
Fix By: Brad House (@bradh352)
`ares__hosts_entry_to_hostent()` would allocate a separate buffer for
each address, but `ares_free_hostent()` expects a single allocation to
hold all addresses.
This PR fixes this issue and simplifies the logic by using the
already-existing `ares__addrinfo2hostent()` to write the hostent instead
of coming up with yet another way to write the structure.
Fixes #823
Fix By: Brad House (@bradh352)
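To illustrate the layout described above, where one allocation holds every address, here is a minimal sketch using plain `malloc()`. `build_addr_list()` is a hypothetical helper, not `ares__addrinfo2hostent()`, and it models only the address list rather than a full `struct hostent`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a NULL-terminated, h_addr_list-style array of
 * `count` addresses of `addrlen` bytes each. All address bytes live in ONE
 * buffer pointed to by list[0], so cleanup is:
 *   free(list[0]); free(list);
 * Returns NULL on invalid input or allocation failure. */
static char **build_addr_list(const void *addrs, size_t count, size_t addrlen)
{
  char **list;
  char  *buf;
  size_t i;

  if (addrs == NULL || count == 0 || addrlen == 0) {
    return NULL;
  }

  /* calloc() zeroes the array, so list[count] is already the NULL
   * terminator. */
  list = calloc(count + 1, sizeof(*list));
  if (list == NULL) {
    return NULL;
  }

  buf = malloc(count * addrlen);
  if (buf == NULL) {
    free(list);
    return NULL;
  }
  memcpy(buf, addrs, count * addrlen);

  /* Each entry points into the shared buffer rather than its own block. */
  for (i = 0; i < count; i++) {
    list[i] = buf + (i * addrlen);
  }

  return list;
}
```

Allocating each address separately would leak every entry after the first when the cleanup code frees only `list[0]` and `list`.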
UDP is connectionless, but systems use ICMP unreachable messages to
indicate there is no ability to reach the host or port, which can result
in a `send()` returning an error like `ECONNREFUSED`. We need to handle
non-retryable codes like that to treat it as a connection failure so we
requeue any queries on that connection to another connection/server
immediately. Otherwise what happens is we just wait on the timeout to
expire which can greatly increase the time required to get a definitive
message.
This also adds a test case to verify the behavior.
Fixes #819
Fix By: Brad House (@bradh352)
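One way to picture the distinction between retryable and non-retryable `send()` errors is a small classifier on `errno`. Which codes count as transient is an assumption made for this sketch, and `send_error_is_conn_failure()` is a hypothetical name rather than a c-ares function.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classifier: returns true when an errno value from send()
 * should be treated as a connection failure, so pending queries can be
 * requeued to another connection/server instead of waiting for a timeout. */
static bool send_error_is_conn_failure(int err)
{
  switch (err) {
    /* Assumed transient conditions: try again on the same connection. */
    case EAGAIN:
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case EINTR:
      return false;

    /* Everything else, e.g. ECONNREFUSED triggered by an ICMP unreachable
     * message, is treated as non-retryable in this sketch. */
    default:
      return true;
  }
}
```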
c-ares utilizes recursion for some operations, and some of these
processes can have unintended side effects, such as if a callback
is called that then recurses into the same function. This can cause
strange cleanup conditions that lead to crashes.
Try to disassociate queries with connections as early as possible and
move cleaning up unneeded connections to its own scan rather than
trying to detect each time a query is disassociated from a connection.
Fix By: Brad House (@bradh352)
In c-ares 1.30.0 we started validating that parsed strings are printable.
This caused a regression in a pycares test case due to a wrong response
code being returned as the error was being propagated from a different
section of code that was assuming the only possible failure condition
was out-of-memory.
This PR adds a fix for this and also a test case to validate it.
Ref: https://github.com/saghul/pycares/issues/200
Fix By: Brad House (@bradh352)
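The kind of bug described above, where a caller assumes out-of-memory is the only possible failure, can be avoided by returning distinct status codes. The sketch below is illustrative only: the `fetch_status_t` values and `fetch_printable()` are hypothetical and do not correspond to the actual c-ares status codes or parser functions.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
typedef enum { FETCH_OK, FETCH_NOMEM, FETCH_BADSTR } fetch_status_t;

/* Hypothetical helper: copy `len` bytes from `buf` into a newly allocated
 * NUL-terminated string, rejecting non-printable bytes. The caller owns
 * *out on success. Callers must propagate the returned status as-is rather
 * than mapping every failure to out-of-memory. */
static fetch_status_t fetch_printable(const unsigned char *buf, size_t len,
                                      char **out)
{
  size_t i;
  char  *str;

  *out = NULL;

  for (i = 0; i < len; i++) {
    if (!isprint(buf[i])) {
      return FETCH_BADSTR; /* validation failure, not an allocation one */
    }
  }

  str = malloc(len + 1);
  if (str == NULL) {
    return FETCH_NOMEM;
  }
  memcpy(str, buf, len);
  str[len] = '\0';

  *out = str;
  return FETCH_OK;
}
```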
We've had reports of use-after-free type crashes in the Windows cleanup
code for the Event Thread. In evaluating the code, it appeared there
were some memory leaks on per-connection handles that may have remained
open during shutdown. While trying to resolve that, it became apparent
the methodology chosen may not have been the right one for interfacing
with the Windows AFD system, as stability issues were seen during this
debugging process.
Since this system is completely undocumented, there was no clear
resolution path other than to switch to the *other* methodology which
involves directly opening `\Device\Afd`, rather than spawning a "peer
socket" to use to queue AFD operations.
The original methodology chosen more closely resembled what is employed
by [libuv](https://github.com/libuv/libuv) and given its widespread use
was the reason it was used. The new methodology more closely resembles
[wepoll](https://github.com/piscisaureus/wepoll).
It's not clear if there are any scalability or performance advantages or
disadvantages for either method. They both seem like different ways to
do the same thing, but this current way does seem more stable.
Fixes #798
Fix By: Brad House (@bradh352)