Maintained by: NLnet Labs

[Unbound-users] Setting timers for servfail answer and request list max wait for broken pipes and lame-delegations

Dario Aguilar
Fri May 21 23:43:11 CEST 2010


Hi, we are seeing odd behavior on our public firewall after migrating to
Unbound. Apparently the DNS server does not reply to some client queries, so
the firewall has to close the pseudo-session by timeout (set to 1 minute, the
minimum possible). With our 45k qps of traffic, this leaves a lot of useless
connections open at the firewall. As far as I can see, the problem happens
only with broken delegations. Is there a workaround, or could this be a bug,
or is it expected to work this way? I would not like to think what could
happen to the firewall connection pool if we received an attack made up
entirely of broken pipes.
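
To put a rough number on the firewall load described above: if every unanswered query leaves a pseudo-session open until the timeout, the steady-state count of stale sessions is approximately query rate x unanswered fraction x timeout. The sketch below is only a back-of-the-envelope estimate; the function name is hypothetical and the 1% unanswered fraction is an assumed value, since no measured figure is given here.

```python
def stale_sessions(qps: float, unanswered_fraction: float, timeout_s: float) -> float:
    """Approximate steady-state number of sessions held open only by the firewall timeout."""
    return qps * unanswered_fraction * timeout_s


# 45k qps and the 60 s timeout come from the description above;
# the 1% unanswered fraction is an assumption for illustration only.
print(stale_sessions(45_000, 0.01, 60))
```

With those inputs the estimate comes to about 27,000 sessions, and it scales linearly with the fraction of queries that hit broken delegations.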

1) What is the maximum time that a request (recursive lookup) stays in the
request list? It seems to be more than 10 minutes.

# unbound-control dump_requestlist |grep winsecureservice
 90    A IN winsecureservice.com. 621.230546 iterator wait for 208.76.63.100

2) Is there any way to reduce or adjust this time, e.g. to 120 or 60 seconds,
or maybe less?
3) On the other hand, how long does Unbound take before it sends a SERVFAIL
to the client for a lame delegation, as in this case? Is there any way to
configure this?
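
For anyone wanting to watch how long entries linger, the dump_requestlist line shown under question 1 can be filtered by age with a small script. This is a minimal sketch: it assumes the whitespace-separated layout seen in that single line (index, type, class, name, age in seconds, then free-form module status), and the names `RequestEntry`, `parse_request_line` and `older_than` are hypothetical helpers, not part of Unbound.

```python
from typing import Iterable, List, NamedTuple, Optional


class RequestEntry(NamedTuple):
    index: int
    qtype: str
    qclass: str
    qname: str
    age_s: float
    status: str


def parse_request_line(line: str) -> Optional[RequestEntry]:
    """Parse one request-list line; return None for headers or unrecognised lines."""
    parts = line.split(None, 5)
    if len(parts) < 5 or not parts[0].isdigit():
        return None
    try:
        age = float(parts[4])
    except ValueError:
        return None
    status = parts[5].strip() if len(parts) > 5 else ""
    return RequestEntry(int(parts[0]), parts[1], parts[2], parts[3], age, status)


def older_than(lines: Iterable[str], threshold_s: float) -> List[RequestEntry]:
    """Return the entries that have been in the request list longer than threshold_s."""
    parsed = (parse_request_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.age_s > threshold_s]


# The sample line is copied from the output above.
sample = " 90    A IN winsecureservice.com. 621.230546 iterator wait for 208.76.63.100"
for entry in older_than([sample], 60.0):
    print(f"{entry.qname} {entry.qtype} waiting {entry.age_s:.0f}s ({entry.status})")
```

In practice the lines would come from the output of `unbound-control dump_requestlist` (for example read from standard input) rather than from a hard-coded sample.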

# dig @127.1 winsecureservice.com
; <<>> DiG 9.7.0-P1 <<>> @127.1 winsecureservice.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
# dig @127.1 winsecureservice.com
; <<>> DiG 9.7.0-P1 <<>> @127.1 winsecureservice.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
# dig @127.1 winsecureservice.com
; <<>> DiG 9.7.0-P1 <<>> @127.1 winsecureservice.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
# dig @127.1 winsecureservice.com
; <<>> DiG 9.7.0-P1 <<>> @127.1 winsecureservice.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached


[root@est-dnsl2c-05 named]# unbound-control lookup winsecureservice.com
The following name servers are used for lookup of winsecureservice.com.
;rrset 85167 4 0 2 0
winsecureservice.com.   171567  IN      NS      ns1.everydns.net.
winsecureservice.com.   171567  IN      NS      ns2.everydns.net.
winsecureservice.com.   171567  IN      NS      ns3.everydns.net.
winsecureservice.com.   171567  IN      NS      ns4.everydns.net.
;rrset 86187 1 0 8 2
ns4.everydns.net.       86187   IN      A       208.76.60.100
;rrset 86187 1 0 8 2
ns3.everydns.net.       86187   IN      A       208.76.63.100
;rrset 86187 1 0 8 2
ns2.everydns.net.       86187   IN      A       208.76.62.100
;rrset 85166 1 0 8 2
ns1.everydns.net.       85166   IN      A       208.76.61.100
Delegation with 4 names, of which 4 can be examined to query further addresses.
It provides 4 IP addresses.
208.76.61.100           rtt 257980 msec, 0 lost. noEDNS probed.
208.76.62.100           rtt 232028 msec, 0 lost. EDNS 0 probed.
208.76.63.100           rtt 314581 msec, 0 lost. noEDNS probed.
208.76.60.100           rtt 317964 msec, 0 lost. noEDNS probed.
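
The rtt values in the listing above are reported in milliseconds, so they are easier to judge once converted to seconds. The sketch below parses only the line shape seen in this output (IPv4 address, `rtt N msec, M lost.`); the regular expression and the helper name `parse_rtt_line` are assumptions made for illustration, not an Unbound interface.

```python
import re
from typing import Optional, Tuple

# Matches lines such as: "208.76.61.100   rtt 257980 msec, 0 lost. noEDNS probed."
_RTT_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+rtt\s+(\d+)\s+msec,\s+(\d+)\s+lost\."
)


def parse_rtt_line(line: str) -> Optional[Tuple[str, float, int]]:
    """Return (address, rtt in seconds, lost count), or None if the line does not match."""
    m = _RTT_LINE.match(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)) / 1000.0, int(m.group(3))


# Lines copied from the lookup output above.
lines = [
    "208.76.61.100           rtt 257980 msec, 0 lost. noEDNS probed.",
    "208.76.62.100           rtt 232028 msec, 0 lost. EDNS 0 probed.",
    "208.76.63.100           rtt 314581 msec, 0 lost. noEDNS probed.",
    "208.76.60.100           rtt 317964 msec, 0 lost. noEDNS probed.",
]
for line in lines:
    parsed = parse_rtt_line(line)
    if parsed is not None:
        addr, rtt_s, lost = parsed
        print(f"{addr}: rtt {rtt_s:.1f} s, {lost} lost")
```

Read this way, all four addresses show rtt values between roughly 232 and 318 seconds, that is, several minutes each.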

[root@est-dnsl2c-05 named]# dig @a.gtld-servers.net winsecureservice.com ns

; <<>> DiG 9.7.0-P1 <<>> @a.gtld-servers.net winsecureservice.com ns
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33380
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 4
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;winsecureservice.com.          IN      NS
;; AUTHORITY SECTION:
winsecureservice.com.   172800  IN      NS      ns1.everydns.net.
winsecureservice.com.   172800  IN      NS      ns2.everydns.net.
winsecureservice.com.   172800  IN      NS      ns3.everydns.net.
winsecureservice.com.   172800  IN      NS      ns4.everydns.net.
;; ADDITIONAL SECTION:
ns1.everydns.net.       172800  IN      A       208.76.61.100
ns2.everydns.net.       172800  IN      A       208.76.62.100
ns3.everydns.net.       172800  IN      A       208.76.63.100
ns4.everydns.net.       172800  IN      A       208.76.60.100
;; Query time: 132 msec
;; SERVER: 192.5.6.30#53(192.5.6.30)
;; WHEN: Fri May 21 17:44:21 2010
;; MSG SIZE  rcvd: 186

Effectively, we are dealing with a lame delegation or broken pipe, so why
did Unbound not reply with a SERVFAIL, instead of keeping the query in the
request list for more than 10 minutes without answering the client?

[root@est-dnsl2c-05 ~]# unbound-host winsecureservice.com
Host winsecureservice.com not found: 2(SERVFAIL).
Host winsecureservice.com not found: 2(SERVFAIL).
Host winsecureservice.com not found: 2(SERVFAIL).


[root@est-dnsl2c-05 named]# dig @208.76.6x/1/2/3.100 winsecureservice.com ns
; <<>> DiG 9.7.0-P1 <<>> @208.76.60.100 winsecureservice.com ns
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached



Best regards,

Dario