Maintained by: NLnet Labs

[Unbound-users] Validation failure of DNSSEC signed domain names

Zbynek Michl
Thu Apr 29 15:11:19 CEST 2010


Hello,

I am experiencing a problem with validation of signed domain names. When I start 
Unbound, validation works fine. But when I put resolver under extreme load 
(thousands queries per second), I get SERVFAIL reply on every correctly signed 
domain query after some time.

It is probably caching bug, because if I restart Unbound or wait some time, then 
Unbound responses correctly again.

Here is log of validating A type query for www.nic.cz when Unbound is in the 
"strange" state.

--- LOG ---
[1272541060] unbound[18734:1] debug: answer from the cache failed
[1272541060] unbound[18734:1] debug: udp request from ip4 127.0.0.1 port 45681 
(len 16)
[1272541060] unbound[18734:1] debug: mesh_run: start
[1272541060] unbound[18734:1] debug: validator[module 0] operate: 
extstate:module_state_initial event:module_event_new
[1272541060] unbound[18734:1] info: validator operate: query <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: validator: pass to next module
[1272541060] unbound[18734:1] debug: mesh_run: validator module exit state is 
module_wait_module
[1272541060] unbound[18734:1] debug: iterator[module 1] operate: 
extstate:module_state_initial event:module_event_pass
[1272541060] unbound[18734:1] debug: process_request: new external request event
[1272541060] unbound[18734:1] debug: iter_handle processing q with state INIT 
REQUEST STATE
[1272541060] unbound[18734:1] info: resolving <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: request has dependency depth of 0
[1272541060] unbound[18734:1] debug: cache delegation returns delegpt
[1272541060] unbound[18734:1] info: DelegationPoint<nic.cz.>: 3 names (0 
missing), 6 addrs (0 result, 6 avail)
[1272541060] unbound[18734:1] info:   a.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] info:   b.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] info:   d.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] debug:    ip6 2001:678:1::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 193.29.206.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    ip6 2001:678:10::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 194.0.13.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    ip6 2001:678:f::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 194.0.12.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug: iter_handle processing q with state INIT 
REQUEST STATE (stage 2)
[1272541060] unbound[18734:1] info: resolving (init part 2):  <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: iter_handle processing q with state INIT 
REQUEST STATE (stage 3)
[1272541060] unbound[18734:1] info: resolving (init part 3):  <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: iter_handle processing q with state QUERY 
TARGETS STATE
[1272541060] unbound[18734:1] info: processQueryTargets: <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: processQueryTargets: targetqueries 0, 
currentqueries 0
[1272541060] unbound[18734:1] info: DelegationPoint<nic.cz.>: 3 names (0 
missing), 6 addrs (0 result, 6 avail)
[1272541060] unbound[18734:1] info:   a.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] info:   b.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] info:   d.ns.nic.cz. * A AAAA
[1272541060] unbound[18734:1] debug:    ip6 2001:678:1::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 193.29.206.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    ip6 2001:678:10::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 194.0.13.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    ip6 2001:678:f::1 port 53 (len 28)
[1272541060] unbound[18734:1] debug:    ip4 194.0.12.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug: attempt to get extra 3 targets
[1272541060] unbound[18734:1] debug: servselect ip4 194.0.12.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    rtt=8469 lost=0
[1272541060] unbound[18734:1] debug: servselect ip4 194.0.13.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    rtt=8389 lost=0
[1272541060] unbound[18734:1] debug: servselect ip4 193.29.206.1 port 53 (len 16)
[1272541060] unbound[18734:1] debug:    rtt=60 lost=0
[1272541060] unbound[18734:1] debug: selrtt 60
[1272541060] unbound[18734:1] info: sending query: <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: sending to target: <nic.cz.> 193.29.206.1#53
[1272541060] unbound[18734:1] debug: serviced query UDP timeout=60 msec
[1272541060] unbound[18734:1] debug: inserted new pending reply id=e60a
[1272541060] unbound[18734:1] debug: opened UDP if=0 port=8457
[1272541060] unbound[18734:1] debug: comm point start listening 19
[1272541060] unbound[18734:1] debug: mesh_run: iterator module exit state is 
module_wait_reply
[1272541060] unbound[18734:1] info: mesh_run: end 1 recursion states (1 with 
reply, 0 detached), 1 waiting replies, 4717 recursion replies sent, 0 replies 
dropped, 0 states jostled out
[1272541060] unbound[18734:1] info: average recursion processing time 52.429515 sec
[1272541060] unbound[18734:1] info: histogram of recursion processing times
[1272541060] unbound[18734:1] info: [25%]=10.3537 median[50%]=24.2551 [75%]=26.5821
[1272541060] unbound[18734:1] info: lower(secs) upper(secs) recursions
[1272541060] unbound[18734:1] info:    0.000000    0.000001 1
[1272541060] unbound[18734:1] info:    0.004096    0.008192 5
[1272541060] unbound[18734:1] info:    0.008192    0.016384 4
[1272541060] unbound[18734:1] info:    0.016384    0.032768 2
[1272541060] unbound[18734:1] info:    0.032768    0.065536 7
[1272541060] unbound[18734:1] info:    0.065536    0.131072 3
[1272541060] unbound[18734:1] info:    0.131072    0.262144 8
[1272541060] unbound[18734:1] info:    0.262144    0.524288 19
[1272541060] unbound[18734:1] info:    0.524288    1.000000 84
[1272541060] unbound[18734:1] info:    1.000000    2.000000 128
[1272541060] unbound[18734:1] info:    2.000000    4.000000 144
[1272541060] unbound[18734:1] info:    4.000000    8.000000 366
[1272541060] unbound[18734:1] info:    8.000000   16.000000 207
[1272541060] unbound[18734:1] info:   16.000000   32.000000 311
[1272541060] unbound[18734:1] info:   32.000000   64.000000 1411
[1272541060] unbound[18734:1] info:   64.000000  128.000000 2017
[1272541060] unbound[18734:1] info: 0RDd mod1 rep <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: cache memory msg=4456435 rrset=4455263 
infra=36173 val=4267600
[1272541060] unbound[18734:1] debug: answer cb
[1272541060] unbound[18734:1] debug: Incoming reply id = e60a
[1272541060] unbound[18734:1] debug: Incoming reply addr = ip4 193.29.206.1 port 
53 (len 16)
[1272541060] unbound[18734:1] debug: lookup size is 1 entries
[1272541060] unbound[18734:1] debug: received udp reply.
[1272541060] unbound[18734:1] debug: udp message[227:0] 
E60A8400000100010003000603777777036E696302637A0000010001C00C00010001000007080004D91FCD32C010000200010000070800070164026E73C010C010000200010000070800040161C03AC010000200010000070800040162C03AC04B00010001000007080004C2000C01C04B001C000100000708001020010678000F00000000000000000001C05B00010001000007080004C2000D01C05B001C000100000708001020010678001000000000000000000001C03800010001000007080004C11DCE01C038001C000100000708001020010678000100000000000000000001
[1272541060] unbound[18734:1] debug: outnet handle udp reply
[1272541060] unbound[18734:1] debug: measured roundtrip at 1 msec
[1272541060] unbound[18734:1] debug: svcd callbacks start
[1272541060] unbound[18734:1] debug: worker svcd callback for qstate 0x7f4c98ba1f20
[1272541060] unbound[18734:1] debug: mesh_run: start
[1272541060] unbound[18734:1] debug: iterator[module 1] operate: 
extstate:module_wait_reply event:module_event_reply
[1272541060] unbound[18734:1] info: iterator operate: query <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: process_response: new external response event
[1272541060] unbound[18734:1] info: scrub for <nic.cz. NS IN>
[1272541060] unbound[18734:1] info: response for <www.nic.cz. A IN>
[1272541060] unbound[18734:1] info: reply from <nic.cz.> 193.29.206.1#53
[1272541060] unbound[18734:1] info: incoming scrubbed packet: ;; ->>HEADER<<- 
opcode: QUERY, rcode: NOERROR, id: 0
;; flags: qr aa ; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 6
;; QUESTION SECTION:
;; www.nic.cz.	IN	A

;; ANSWER SECTION:
www.nic.cz.	1800	IN	A	217.31.205.50

;; AUTHORITY SECTION:
nic.cz.	1800	IN	NS	d.ns.nic.cz.
nic.cz.	1800	IN	NS	a.ns.nic.cz.
nic.cz.	1800	IN	NS	b.ns.nic.cz.

;; ADDITIONAL SECTION:
a.ns.nic.cz.	1800	IN	A	194.0.12.1
a.ns.nic.cz.	1800	IN	AAAA	2001:678:f::1
b.ns.nic.cz.	1800	IN	A	194.0.13.1
b.ns.nic.cz.	1800	IN	AAAA	2001:678:10::1
d.ns.nic.cz.	1800	IN	A	193.29.206.1
d.ns.nic.cz.	1800	IN	AAAA	2001:678:1::1

;; Query time: 0 msec
;; WHEN: Thu Jan  1 01:00:00 1970
;; MSG SIZE  rcvd: 227

[1272541060] unbound[18734:1] debug: iter_handle processing q with state QUERY 
RESPONSE STATE
[1272541060] unbound[18734:1] info: query response was ANSWER
[1272541060] unbound[18734:1] debug: iter_handle processing q with state 
FINISHED RESPONSE STATE
[1272541060] unbound[18734:1] info: finishing processing for <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: mesh_run: iterator module exit state is 
module_finished
[1272541060] unbound[18734:1] debug: validator[module 0] operate: 
extstate:module_wait_module event:module_event_moddone
[1272541060] unbound[18734:1] info: validator operate: query <www.nic.cz. A IN>
[1272541060] unbound[18734:1] debug: validator: nextmodule returned
[1272541060] unbound[18734:1] debug: val handle processing q with state 
VAL_INIT_STATE
[1272541060] unbound[18734:1] debug: validator classification positive
[1272541060] unbound[18734:1] info: no signer, using <www.nic.cz. TYPE0 CLASS0>
[1272541060] unbound[18734:1] debug: val handle processing q with state 
VAL_FINISHED_STATE
[1272541060] unbound[18734:1] info: validation failure <www.nic.cz. A IN>: key 
for validation www.nic.cz. is marked as invalid because of a previous validation 
failure <www.nic.cz. A IN>: no DNSSEC records from 193.29.206.1 for DS 
www.nic.cz. while building chain of trust
[1272541060] unbound[18734:1] debug: mesh_run: validator module exit state is 
module_finished
[1272541060] unbound[18734:1] debug: query took 0.001500 sec
[1272541060] unbound[18734:1] info: mesh_run: end 0 recursion states (0 with 
reply, 0 detached), 0 waiting replies, 4718 recursion replies sent, 0 replies 
dropped, 0 states jostled out
[1272541060] unbound[18734:1] info: average recursion processing time 52.418403 sec
[1272541060] unbound[18734:1] info: histogram of recursion processing times
[1272541060] unbound[18734:1] info: [25%]=10.3151 median[50%]=24.2438 [75%]=26.5741
[1272541060] unbound[18734:1] info: lower(secs) upper(secs) recursions
[1272541060] unbound[18734:1] info:    0.000000    0.000001 1
[1272541060] unbound[18734:1] info:    0.001024    0.002048 1
[1272541060] unbound[18734:1] info:    0.004096    0.008192 5
[1272541060] unbound[18734:1] info:    0.008192    0.016384 4
[1272541060] unbound[18734:1] info:    0.016384    0.032768 2
[1272541060] unbound[18734:1] info:    0.032768    0.065536 7
[1272541060] unbound[18734:1] info:    0.065536    0.131072 3
[1272541060] unbound[18734:1] info:    0.131072    0.262144 8
[1272541060] unbound[18734:1] info:    0.262144    0.524288 19
[1272541060] unbound[18734:1] info:    0.524288    1.000000 84
[1272541060] unbound[18734:1] info:    1.000000    2.000000 128
[1272541060] unbound[18734:1] info:    2.000000    4.000000 144
[1272541060] unbound[18734:1] info:    4.000000    8.000000 366
[1272541060] unbound[18734:1] info:    8.000000   16.000000 207
[1272541060] unbound[18734:1] info:   16.000000   32.000000 311
[1272541060] unbound[18734:1] info:   32.000000   64.000000 1411
[1272541060] unbound[18734:1] info:   64.000000  128.000000 2017
[1272541060] unbound[18734:1] debug: cache memory msg=4456435 rrset=4455263 
infra=36173 val=4267600
[1272541060] unbound[18734:1] debug: svcd callbacks end
[1272541060] unbound[18734:1] debug: close of port 8457
[1272541060] unbound[18734:1] debug: close fd 19
--- /LOG ---

The error that can be seen above is:

"info: no signer, using <www.nic.cz. TYPE0 CLASS0>"

and then

"info: validation failure <www.nic.cz. A IN>: key for validation www.nic.cz. is 
marked as invalid because of a previous validation failure <www.nic.cz. A IN>: 
no DNSSEC records from 193.29.206.1 for DS www.nic.cz. while building chain of 
trust"

Yes, because resolver should query 193.29.206.1 for DS nic.cz., not www.nic.cz.
So it seems, that Unbound can not parse signer's name from the appropriate RRSIG 
that is cached, because there is no Unbound's query for the RRSIG on the network 
at this time.

Regards,
Zbynek