Some socket functions weren't exposed for use by other areas of the library. Expose
those and make use of them in ares__sortaddrinfo().
Fix By: Andrew Selivanov (@ki11roy)
* Harden and rationalize c-ares timeout computation
* Remove the rand() part of the timeout calculation completely.
When c-ares sends a DNS query, it computes the timeout for that request as follows:
    timeplus = channel->timeout << (query->try_count / channel->nservers);
    timeplus = (timeplus * (9 + (rand () & 7))) / 16;
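I see two issues with this code. Firstly, when either try_count or channel->timeout is large enough, this can end up as an illegal shift: the shift count can reach or exceed the width of the type, or the shifted value can overflow, and both are undefined behavior in C.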
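As a rough sketch of what a guarded shift could look like: the helper name safe_backoff_ms and the clamp-to-INT_MAX policy below are assumptions made for illustration, not the code in the patch.

```c
#include <limits.h>

/* Hypothetical helper: computes base_ms << (try_count / nservers), but
 * clamps to INT_MAX instead of performing an undefined shift. */
static int safe_backoff_ms(int base_ms, int try_count, int nservers)
{
  int shift;

  if (base_ms <= 0)
    return 0;
  if (try_count < 0)
    try_count = 0;
  if (nservers <= 0)
    nservers = 1;

  shift = try_count / nservers;

  /* A shift count at or beyond the number of value bits is undefined,
   * and so is a signed result that does not fit in int. */
  if (shift >= (int)(sizeof(int) * CHAR_BIT) - 1 ||
      base_ms > (INT_MAX >> shift))
    return INT_MAX;

  return base_ms << shift;
}
```

Clamping is only one possible policy; capping the shift count at a small fixed maximum would work as well.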
Secondly, the algorithm for adding the random timeout (added in 2009) is surprising. The original commit that introduced this algorithm says it was done to avoid a "packet storm". But, the algorithm appears to only reduce the timeout by an amount proportional to the scaled timeout's magnitude. It isn't clear to me that, for example, cutting a 30 second timeout almost in half to roughly 17 seconds is appropriate. Even with the default timeout of 5000 ms, this algorithm computes values between 2812 ms and 5000 ms, which is enough to cause a slightly latent DNS response to get spuriously dropped.
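To make the range concrete, the scaling step can be isolated into a small function that takes the rand() result as a parameter. The function name is hypothetical; the arithmetic is the second line of the snippet quoted above.

```c
/* Hypothetical wrapper around the old scaling step. r stands in for the
 * value returned by rand(), so both extremes can be evaluated directly. */
static int old_scaled_timeout_ms(int timeplus, int r)
{
  return (timeplus * (9 + (r & 7))) / 16;
}
```

With (r & 7) == 0 the factor is 9/16, so 5000 ms becomes 2812 ms and 30000 ms becomes 16875 ms. With (r & 7) == 7 the factor is 16/16 and the timeout is unchanged. The result is never larger than the input.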
If preventing the timers from all expiring at the same time really is desirable, then it seems better to extend the timeout by a small factor so that the application gets at least the timeout it asked for, and maybe a little more. In my experience, this is common practice for timeouts: applications expect that a timeout will happen at or after the designated time (but not before), allowing for delay in detecting and reporting the timeout. Furthermore, it seems like the timeout shouldn't be extended by very much (we don't want a 30 second timeout changing into a 45 second timeout, either).
Consider also the documentation of channel->timeout in ares_init_options():
The number of milliseconds each name server is given to respond to a query on the first try. (After the first try, the timeout algorithm becomes more complicated, but scales linearly with the value of timeout.) The default is five seconds.
In the current implementation, even the first try does not use the value that the user supplies; it will use anywhere between 56% and 100% of that value.
The attached patch attempts to address all of these concerns without trying to make the algorithm much more sophisticated. After performing a safe shift, this patch simply adds a small random amount, between 0 ms and 511 ms, to the computed value. I could see limiting the random amount to be no greater than a proportion of the configured magnitude, but I can't see scaling the random amount with the overall computed timeout. As far as I understand, the goal is just to schedule retries "not at the same exact time", so a small difference seems sufficient.
UPDATE: randomization removed.
Closes PR #187
Fix by: Brad Spencer
ares_process.c uses htonl, which needs <arpa/inet.h> included.
ares_getnameinfo.c uses a dynamically selected format string for
sprintf, which -Wformat-nonliteral doesn't like. Usually one would use
inttypes.h and a format string "%" PRIu32, but C99 is too new for some
supported platforms.
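One C89-friendly way to keep the format string a literal is to widen the value to unsigned long and always print with "%lu". This is a minimal sketch under that assumption; the helper name and the buffer-size contract are hypothetical and not taken from ares_getnameinfo.c.

```c
#include <stdio.h>

/* Hypothetical helper. buf must have room for at least 21 bytes: up to
 * 20 digits for a 64-bit unsigned long, plus the terminating NUL.
 * Returns the number of characters written, as sprintf does. */
static int format_unsigned(char *buf, unsigned int value)
{
  /* The cast makes the argument match "%lu" on every platform, so no
   * run-time choice between "%u" and "%lu" is needed. */
  return sprintf(buf, "%lu", (unsigned long)value);
}
```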
socklen_t should not be used in code; ares_socklen_t should be used instead.
Convert ssize_t to ares_ssize_t for portability since the public API now exposes this.
Uses virtual socket IO functions when set on a channel.
Note that no socket options are set, nor is any binding
done by the library in this case, since the client defining
these is probably more suited to deal with this.
This function sets a callback that is invoked after the socket is
created, but before the connection is established. This is an ideal
time to customize various socket options.
Add user-visible entrypoints ares_{get,set}_servers_ports(3), which
take struct ares_addr_port_node rather than struct ares_addr_node.
This structure includes a UDP and a TCP port number; if either is set
to zero, the corresponding channel-wide port value is used as before.
Similarly, add a new ares_set_servers_ports_csv(3) entrypoint, which
is analogous to ares_set_servers_csv(3) except it doesn't ignore any
specified port information; instead, any per-server specified port
is used as both the UDP and TCP port for that server.
The internal struct ares_addr is extended to hold the UDP/TCP ports,
stored in network order, with the convention that a value of zero
indicates that the channel-wide UDP/TCP port should be used.
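The zero-means-default convention can be sketched as follows. The struct and function names are hypothetical stand-ins rather than the library's internal definitions, and both ports are assumed to already be in network order.

```c
#include <arpa/inet.h>

/* Hypothetical stand-in for the per-server port fields. */
struct server_ports {
  unsigned short udp_port; /* network order; 0 means "use channel-wide" */
  unsigned short tcp_port; /* network order; 0 means "use channel-wide" */
};

/* Picks the per-server UDP port if one was given, otherwise the
 * channel-wide one. The result stays in network order. */
static unsigned short effective_udp_port(const struct server_ports *s,
                                         unsigned short channel_udp_port)
{
  return s->udp_port ? s->udp_port : channel_udp_port;
}
```

Because htons(0) is 0 on every platform, the zero test gives the same answer whether the field is viewed in host or network order.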
For the internal implementation of ares_dup(3), shift to use the
_ports() version of the get/set functions, so port information is
transferred correctly to the new channel.
Update manpages, and add missing ares_set_servers_csv to the lists
while we're at it.
Add a new ares_library_init_mem() initialization function for the
library which allows the library user to specify their own malloc,
realloc & free equivalents for use library-wide.
Store these function pointers in library-wide global variables,
defaulting to libc's malloc(), realloc() and free().
Change all calls to malloc, realloc and free to use the function pointer
instead. Also ensure that ares_strdup() is always available
(even if the local environment includes strdup(3)), and change the
library code to always use it.
Convert calls to calloc() to use ares_malloc() + memset.
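The general pattern looks roughly like the sketch below. All names here (lib_malloc, lib_set_allocators, lib_strdup, lib_zalloc, counting_malloc) are hypothetical and chosen for illustration; they are not the identifiers used in the library.

```c
#include <stdlib.h>
#include <string.h>

/* Library-wide allocator hooks, defaulting to libc. */
static void *(*lib_malloc)(size_t) = malloc;
static void *(*lib_realloc)(void *, size_t) = realloc;
static void (*lib_free)(void *) = free;

static void lib_set_allocators(void *(*m)(size_t),
                               void *(*r)(void *, size_t),
                               void (*f)(void *))
{
  lib_malloc = m;
  lib_realloc = r;
  lib_free = f;
}

/* strdup replacement that always goes through the hook, even where the
 * platform provides strdup(3). */
static char *lib_strdup(const char *s)
{
  size_t len = strlen(s) + 1;
  char *copy = lib_malloc(len);

  if (copy)
    memcpy(copy, s, len);
  return copy;
}

/* calloc replacement: hooked malloc plus memset, with an overflow check
 * on the multiplication. */
static void *lib_zalloc(size_t n, size_t size)
{
  void *p;

  if (size != 0 && n > (size_t)-1 / size)
    return NULL;
  p = lib_malloc(n * size);
  if (p)
    memset(p, 0, n * size);
  return p;
}

/* Example of a user-supplied allocator that counts calls. */
static size_t alloc_calls;

static void *counting_malloc(size_t n)
{
  alloc_calls++;
  return malloc(n);
}
```

A caller would install its functions once, before any other use of the library, for example with lib_set_allocators(counting_malloc, realloc, free).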
Add comments for the benefit of the lcov tool, marking
lines that cannot be hit. Typically these are fall-back
protection arms that are already covered by earlier checks,
and so it's not worth taking out the unhittable code (in case
someone changes the code between the two places in future).
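A small illustration of the idea, using lcov's LCOV_EXCL_LINE marker. The two functions are hypothetical and exist only to show a defensive arm that an earlier check makes unreachable.

```c
#include <stddef.h>

/* Inner helper: keeps its own range check as protection against future
 * changes to the caller. */
static int lookup(const int *table, size_t len, size_t i)
{
  if (i >= len)
    return 0; /* LCOV_EXCL_LINE: caller has already range-checked i */
  return table[i];
}

/* Outer entry point: validates arguments before calling lookup(). */
static int checked_lookup(const int *table, size_t len, size_t i, int *out)
{
  if (table == NULL || out == NULL || i >= len)
    return -1;
  *out = lookup(table, len, i);
  return 0;
}
```

For a multi-line arm, lcov also recognizes the LCOV_EXCL_START and LCOV_EXCL_STOP markers around the excluded region.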
When a server rejects an EDNS-equipped request, we retry without
the EDNS option. However, in TCP mode, the 2-byte length prefix was
being calculated incorrectly -- it was built from the answer length
rather than the length of the original request.
Also, it is theoretically possible that the call to realloc() might change
the data pointed to; to allow for this, qbuf also needs updating.
(Both these fixes were actually included in a patchset sent on the mailing
list in Oct 2012, but were included with other functional changes that
didn't get merged:
http://c-ares.haxx.se/mail/c-ares-archive-2012-10/0004.shtml)
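Both points can be sketched together. The struct below is a hypothetical stand-in for the query bookkeeping (a TCP buffer holding a 2-byte big-endian length prefix followed by the request, plus a pointer to the request inside it); the field names mirror the description above but the code is not the library's.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-query buffers. */
struct pending_query {
  unsigned char *tcpbuf;      /* 2-byte length prefix + request */
  size_t tcplen;              /* qlen + 2 */
  const unsigned char *qbuf;  /* points at tcpbuf + 2 */
  size_t qlen;                /* length of the request itself */
};

/* Drops `trailing` bytes from the end of the request (for example, an
 * option being removed before a retry) and rebuilds the TCP framing.
 * Returns 0 on success, -1 on failure with *q left unchanged. */
static int shrink_query(struct pending_query *q, size_t trailing)
{
  size_t new_qlen;
  unsigned char *p;

  if (trailing > q->qlen || q->qlen > 0xFFFF)
    return -1;
  new_qlen = q->qlen - trailing;

  p = realloc(q->tcpbuf, new_qlen + 2);
  if (p == NULL)
    return -1;

  /* realloc() may have moved the block, so refresh every pointer into it. */
  q->tcpbuf = p;
  q->qbuf = p + 2;
  q->qlen = new_qlen;
  q->tcplen = new_qlen + 2;

  /* The prefix is derived from the request length, not the answer length. */
  p[0] = (unsigned char)((new_qlen >> 8) & 0xFF);
  p[1] = (unsigned char)(new_qlen & 0xFF);
  return 0;
}
```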
CID 56884, pointed out by Coverity. We really should make this function
return an error code so that a malloc() failure can be reported to the
caller as a hard failure.
I can see that recvfrom() in ares_process.c is often called with
'udp_socket' == ARES_SOCKET_BAD. The outer loop takes care not to call
recv/recvfrom with ARES_SOCKET_BAD. So should the inner loop.
07bc7ea7953392a50ea39912637d32
The purpose of the whole patch was to silence a compiler warning triggered
with GCC 4 on file ares_process.c. The specific compiler warning was
'dereferencing type-punned pointer might break strict-aliasing rules'.
A simpler patch will follow to equally silence the warning.
AIX, at least, does not have sockaddr_storage.ss_family member.
Detect this in the configure logic and use proper #ifdefs in the
ares_process logic.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Tor Arntsen <tor@spacetec.no>
Add 3 new functions to set the local binding for the out-going
socket connection, and add ares_set_servers_csv() to set a
list of servers at once as a comma-separated string.
Signed-off-by: Ben Greear <greearb@candelatech.com>