Maintained by: NLnet Labs

[Unbound-users] unbound handling of SERVFAIL response code

W.C.A. Wijngaards
Mon Apr 22 16:33:44 CEST 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Brian,

On 04/22/2013 04:02 PM, Lamers, Brian J (Brian) wrote:
> Our product uses unbound DNS recursor as a simple forwarding
> interface to remote DNS servers owned by the customer. In this case
> there are two DNS servers in the customer network and the
> assumption is unbound will choose the server based on RTT
> (RoundTripTime) delay.

Unbound prefers the server with the lowest RTT, but it uses the other
servers when no answer can be obtained from the preferred one.

> Recently, our customer had some issues with one of their DNS
> servers (they were not specific), but from tcpdump output it
> appears the DNS server responded to NAPTR requests very quickly (<1
> msec) but had SERVFAIL (2) as the response code. The customer
> claims the other DNS server did not have issues but was not chosen
> (response took longer – maybe several msecs). The customer
> complained that the other server should have been selected instead
> of choosing the ‘bad’ server responses.

Unbound selects the fast server first and queries it.  When that does
not work, it falls back to the slower server and gets the answer
there.

The scenario that you sketch sounds implausible or incomplete,
particularly the RTT values suggested.  Unbound has an RTT band of 400
msec.  Thus, the slower server has to be more than 400 msec slower
than the fast server for unbound to not consider it (right away).  For
example, if the fast server has an RTT of 5 msec, the slow server
would need an RTT above 405 msec.  That is very, very slow.

If several servers fall into the RTT band, unbound picks one of them
at random to send to.  Thus, under normal circumstances unbound should
have picked the 'slow' server fairly often, if this server had
relatively normal RTTs (<400 msec).
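
To make the band rule above concrete, here is a rough sketch in
Python.  It is not unbound's actual code: the function names and the
example RTT values are made up for illustration, and only the 400 msec
band width comes from the description above.

```python
import random

RTT_BAND_MSEC = 400  # band width described above


def candidates_in_band(rtts, band=RTT_BAND_MSEC):
    """Return the servers whose RTT is within `band` msec of the fastest.

    `rtts` maps a (hypothetical) server name to its measured RTT in msec.
    """
    fastest = min(rtts.values())
    return sorted(name for name, rtt in rtts.items() if rtt - fastest <= band)


def pick_server(rtts, rng=random):
    """Pick one server at random among those inside the RTT band."""
    return rng.choice(candidates_in_band(rtts))


# A few msec of difference: both servers are candidates.
print(candidates_in_band({"fast": 5, "slow": 9}))
# Far more than 400 msec slower: only the fast server is considered.
print(candidates_in_band({"fast": 5, "slow": 500}))
```

With RTTs of <1 msec and "several msecs", both servers end up in the
candidate list, so the slow server should be chosen regularly.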

Of course during this random selection, if the fast server responds
with SERVFAIL, unbound falls back to the slow server, after a couple
of queries, regardless of the RTT to that slow server.  Unbound will
tolerate RTTs up to 2 minutes (!!) for that sort of case.

Was there something else going on?  Did the slow server have other
problems (malformed responses, firewalled, RTT > 2 minutes)?

> I have seen the discussion on how unbound selects which server to
> use based on RTT but it seems like it is designed more for handling
> network connectivity issues, timeouts and such. So what is the
> expected behavior when DNS responses are received but have a
> response code other than NOERROR (particularly SERVFAIL)? Are there
> any documents or discussions on this? Are there any settings
> (configurations) which would change behavior in this case?

Unbound will retry, and once it is satisfied that the request cannot
be resolved on this server, it attempts to query the other servers.
When it finally runs out of servers to query, unbound returns
SERVFAIL.
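
The fallback described above can be sketched roughly as follows, in
Python.  This is only an illustration, not unbound's implementation:
`resolve`, `send_query`, `fake_send` and the number of attempts per
server are hypothetical, and timeouts are left out entirely.

```python
NOERROR, SERVFAIL = 0, 2  # DNS response codes


def resolve(servers, send_query, attempts_per_server=2):
    """Try each server in order; give up on a server after repeated SERVFAIL.

    `send_query(server)` is a hypothetical callable returning
    (rcode, answer).  The attempt count is an assumption for this sketch.
    """
    for server in servers:
        for _ in range(attempts_per_server):
            rcode, answer = send_query(server)
            if rcode != SERVFAIL:
                return rcode, answer
    # Ran out of servers to query: the client gets SERVFAIL.
    return SERVFAIL, None


def fake_send(server):
    """Stand-in for the network: the fast server always fails."""
    if server == "fast":
        return SERVFAIL, None
    return NOERROR, "answer from slow"


print(resolve(["fast", "slow"], fake_send))
print(resolve(["fast"], fake_send))
```

The first call ends with the answer from the slow server; the second
has no other server to fall back to and ends in SERVFAIL.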

If the customer wants 'immediate query resolution' (why are
milliseconds important anyway?), perhaps enable prefetch, so that
answers for common queries are already in the cache (and perhaps
increase the cache size)?  If you can also control the TTL, increasing
the TTL of the answers makes them more likely to get prefetch
treatment (even at lower traffic density).
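
A small worked example of why a longer TTL helps prefetch, in Python.
It assumes that prefetch is triggered by a cache hit during the last
10% of an entry's TTL; that fraction, the function names and the query
rate below are assumptions made for illustration.

```python
def prefetch_window_sec(ttl_sec, fraction=0.10):
    """Length of the period at the end of the TTL in which a hit prefetches.

    The 10% fraction is an assumption of this sketch.
    """
    return ttl_sec * fraction


def expected_hits_in_window(ttl_sec, queries_per_sec, fraction=0.10):
    """Average number of queries for a name that land in that window."""
    return prefetch_window_sec(ttl_sec, fraction) * queries_per_sec


# One query every 20 seconds for the same name (hypothetical rate).
rate = 0.05
for ttl in (60, 3600):
    print(ttl, prefetch_window_sec(ttl), expected_hits_in_window(ttl, rate))
```

At the same query rate, the short TTL gives a window that most queries
miss, while the long TTL gives a window that several queries fall
into, so the entry is refreshed before it expires.  Prefetch itself is
switched on with `prefetch: yes` in unbound.conf.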

Best regards,
   Wouter

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRdUpDAAoJEJ9vHC1+BF+N5XQP/3+3wkXrUjoM4uTuC8D4TNQt
xEDmtarKq7KuXKJ4G6Nodod00qF5e853Yd2eRA3u5BJ26cWGqGhOHtR4SAM9Psc8
Zkz3msCbhBDosNODrfueVQh2H9HJWsy7/wQvsgSlK2O/tRnNNvxlV/kmvV5+ralx
LV0WKHbGDB31Sfi/eSTjPHcFIgO07D3ZGmUhUtTmfzt6aPuRBGZoKl/gi+4UhdbG
QRnPmRM9H1uUxoAinHbdm3cAv2wGB+E3yMgEX1Xc6bU/KBaaxPzwL6n5LxJg7SaZ
jv9YhTgWkoUitZ9yWhyLzmepp/cRHame3U50s30nY1wyOAGxq2eOxAkIotZ24xeN
Is0HVdxcqhS4RX9xYTCQSmYpvvpSrIqn8OWqTrlBViE5gUupLialaGPHZQ+FA0VD
U+7j0LBn2XDFO+kGJdQlBcU3WlykT/HtzAUfudaPCZaEOix4kz3dwwdzUKZTOsK/
PvFU7RtcNH7C4YsEM+EDy2S5mqzLPH9CDPg2a0U3Eku2Am1my2Pxr1rjX2IBYMs2
4OK8f4gYYZJY996n5gkoSMb+WxkWtOCURBx39S5NQkXc0Ttvrmgn9cc0tJZxAyIX
DQDVeGrQVyLbbKDrNT6tBxFQcHp7eYxajnClheaZR+8AAFZiXmBxrvEyUz5ghDxg
yWpsTptLoW0DdDKa9+/f
=vBIl
-----END PGP SIGNATURE-----