Maintained by: NLnet Labs

[Unbound-users] Unbound issues

W.C.A. Wijngaards
Wed Jan 20 09:41:21 CET 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Rod,

With unbound-control dump_requestlist you can get a view of what those
2000 queries are that it is working on (you have then to configure
unbound-control-setup, set remote-control: yes in config).  Presumably
unbound is receiving a lot of queries that take very long to resolve.

Without the other elements in this mail I would suggest you could
increase the outgoing range and num-queries-per-thread to 4096.

The jostle-timeout is also very large, meaning that this is causing
unbound to be unable to remove those very long taking queries from its
requestlist when exceeding its limits.  jostle-timeout: 100 or 200 could
help unbound remove the 'bad queries' earlier on.

Also, because you have harden-referral-path set, unbound is generating
internal work for those checks.  This needs extra outgoing-range (but
not num-queries-per-thread), so try outgoing-range: 4096 and
num-queries-per-thread: 2048    .  So that there is plenty of space for
the check queries, otherwise an internal queue could be forming to get
an outgoing-range(source port number), but because you hit the max
queries per thread and all those spawn extra checks, the limit is
reached with certainty, slowing down the server considerably, (and this
slowdown may even trigger a 'death spiral' as it causes even longer
waiting lists - hence perhaps the large 'exceeded' values).

Why thread 2 gets so many queries, it could be that the quad cpu is
really hyperthreaded?  Or else, perhaps the OS wants to run the network
stack on some of the CPUs, causing not all unbound threads to get work
(as they are scheduled out in favor of the network stack), in which case
less unbound threads could remove this behaviour.  Another reason could
be simply that the load on the machine is so light that the OS decides
thread 2 can handle it...

Another disconcerting fact is that the cache hit rate is 54% when its
still OK but drops to 26% when you get trouble.  Take a look at that
dump-requestlist, and also perform a unbound-control stats_noreset >
snapshot and look at it: the number of unwanted replies in particular.
Certain forms of Kaminsky-eeks! would tend to blast lots of different
queries to a resolver...

Best regards,
   Wouter


On 01/19/2010 11:15 PM, Rod Taylor wrote:
> We are trying unbound out. Machine is quad CPU with 16GB ram. I'm
> picking on thread 2 because it shows the issue best as it seems to
> receive up to 6 times as many queries as the other 3 threads. As you
> can see, the time to execute climbs gradually through the period and
> starts to "exceed" allowed limits about mid-way down the stats results
> shown here.
> 
> Config bits.
> server:
> 	verbosity: 1
> 	statistics-interval: 300
> 	statistics-cumulative: yes
> 	extended-statistics: no
> 	num-threads: 4
> 	outgoing-range: 2048
> 	msg-cache-size: 3G
> 	msg-cache-slabs: 8
> 	num-queries-per-thread: 2048
> 	jostle-timeout: 1000
> 	so-rcvbuf: 8M
> 	rrset-cache-size: 6G
> 	rrset-cache-slabs: 8
> 	cache-min-ttl: 7776000
> 	harden-glue: yes
> 	harden-dnssec-stripped: yes
> 	harden-referral-path: yes
> 	use-caps-for-id: yes
> 	unwanted-reply-threshold: 10000000
> 	val-clean-additional: yes
> 	val-permissive-mode: no
> 	key-cache-slabs: 8
> 	neg-cache-size: 1g
> 
> 
> Below are snippets from the statistics taken roughly every 5 minutes
> from startup over a period of 7 hours.
> 
> server stats for thread 2: 420 queries, 284 answers from cache, 136 recursions
> server stats for thread 2: requestlist max 21 avg 16.2647 exceeded 0
> server stats for thread 2: 3189 queries, 3053 answers from cache, 136 recursions
> server stats for thread 2: requestlist max 21 avg 16.2647 exceeded 0
> server stats for thread 2: 3375 queries, 3239 answers from cache, 136 recursions
> server stats for thread 2: requestlist max 21 avg 16.2647 exceeded 0
> server stats for thread 2: 4560 queries, 4424 answers from cache, 136 recursions
> server stats for thread 2: requestlist max 21 avg 16.2647 exceeded 0
> server stats for thread 2: 32129 queries, 13211 answers from cache,
> 18918 recursions
> server stats for thread 2: requestlist max 1196 avg 620.017 exceeded 0
> server stats for thread 2: 227635 queries, 63600 answers from cache,
> 164035 recursions
> server stats for thread 2: requestlist max 1196 avg 605.375 exceeded 0
> server stats for thread 2: 412374 queries, 113701 answers from cache,
> 298673 recursions
> server stats for thread 2: requestlist max 1196 avg 649.591 exceeded 0
> server stats for thread 2: 591627 queries, 164560 answers from cache,
> 427067 recursions
> server stats for thread 2: requestlist max 1196 avg 685.99 exceeded 0
> server stats for thread 2: 762418 queries, 214876 answers from cache,
> 547542 recursions
> server stats for thread 2: requestlist max 1196 avg 725.279 exceeded 0
> server stats for thread 2: 925072 queries, 265281 answers from cache,
> 659791 recursions
> server stats for thread 2: requestlist max 1196 avg 766.182 exceeded 0
> server stats for thread 2: 1082217 queries, 315509 answers from cache,
> 766708 recursions
> server stats for thread 2: requestlist max 1213 avg 803.033 exceeded 0
> server stats for thread 2: 1240514 queries, 367629 answers from cache,
> 872885 recursions
> server stats for thread 2: requestlist max 1336 avg 841.245 exceeded 0
> server stats for thread 2: 1399258 queries, 421640 answers from cache,
> 977618 recursions
> server stats for thread 2: requestlist max 1360 avg 874.701 exceeded 0
> server stats for thread 2: 1555372 queries, 476520 answers from cache,
> 1078852 recursions
> server stats for thread 2: requestlist max 1390 avg 908.038 exceeded 0
> server stats for thread 2: 1714148 queries, 534429 answers from cache,
> 1179719 recursions
> server stats for thread 2: requestlist max 1478 avg 942.845 exceeded 0
> server stats for thread 2: 1877100 queries, 595948 answers from cache,
> 1281152 recursions
> server stats for thread 2: requestlist max 1603 avg 977.634 exceeded 0
> server stats for thread 2: 2046507 queries, 661158 answers from cache,
> 1385349 recursions
> server stats for thread 2: requestlist max 1661 avg 1018.34 exceeded 0
> server stats for thread 2: 2230528 queries, 734070 answers from cache,
> 1496458 recursions
> server stats for thread 2: requestlist max 1792 avg 1063.32 exceeded 0
> server stats for thread 2: 2420948 queries, 811552 answers from cache,
> 1609396 recursions
> server stats for thread 2: requestlist max 1876 avg 1106.96 exceeded 0
> server stats for thread 2: 2627825 queries, 898335 answers from cache,
> 1729490 recursions
> server stats for thread 2: requestlist max 1948 avg 1152.9 exceeded 0
> server stats for thread 2: 2847864 queries, 993012 answers from cache,
> 1854852 recursions
> server stats for thread 2: requestlist max 2010 avg 1198.99 exceeded 0
> server stats for thread 2: 3075730 queries, 1093945 answers from
> cache, 1981785 recursions
> server stats for thread 2: requestlist max 2088 avg 1244.61 exceeded 0
> server stats for thread 2: 3320791 queries, 1204754 answers from
> cache, 2116037 recursions
> server stats for thread 2: requestlist max 2129 avg 1289.26 exceeded 0
> server stats for thread 2: 3593773 queries, 1329840 answers from
> cache, 2263933 recursions
> server stats for thread 2: requestlist max 2129 avg 1333.68 exceeded 0
> server stats for thread 2: 3877901 queries, 1462415 answers from
> cache, 2415486 recursions
> server stats for thread 2: requestlist max 2340 avg 1381.81 exceeded 0
> server stats for thread 2: 4169522 queries, 1601187 answers from
> cache, 2568335 recursions
> server stats for thread 2: requestlist max 2340 avg 1424.49 exceeded 0
> server stats for thread 2: 4458010 queries, 1741370 answers from
> cache, 2716640 recursions
> server stats for thread 2: requestlist max 2340 avg 1461.19 exceeded 0
> server stats for thread 2: 4755954 queries, 1887372 answers from
> cache, 2868582 recursions
> server stats for thread 2: requestlist max 2340 avg 1497.73 exceeded 0
> server stats for thread 2: 5054311 queries, 2035933 answers from
> cache, 3018378 recursions
> server stats for thread 2: requestlist max 2392 avg 1534.58 exceeded 0
> server stats for thread 2: 5362834 queries, 2191665 answers from
> cache, 3171169 recursions
> server stats for thread 2: requestlist max 2438 avg 1571.12 exceeded 0
> server stats for thread 2: 5669256 queries, 2347984 answers from
> cache, 3321272 recursions
> server stats for thread 2: requestlist max 2463 avg 1604.42 exceeded 0
> server stats for thread 2: 5975214 queries, 2506445 answers from
> cache, 3468769 recursions
> server stats for thread 2: requestlist max 2463 avg 1635.18 exceeded 0
> server stats for thread 2: 6279369 queries, 2665381 answers from
> cache, 3613988 recursions
> server stats for thread 2: requestlist max 2591 avg 1664.99 exceeded 0
> server stats for thread 2: 6577860 queries, 2822890 answers from
> cache, 3754970 recursions
> server stats for thread 2: requestlist max 2591 avg 1691.34 exceeded 0
> server stats for thread 2: 6879207 queries, 2983199 answers from
> cache, 3896008 recursions
> server stats for thread 2: requestlist max 2591 avg 1717.35 exceeded 0
> server stats for thread 2: 7175981 queries, 3141895 answers from
> cache, 4034086 recursions
> server stats for thread 2: requestlist max 2607 avg 1744.31 exceeded 0
> server stats for thread 2: 7468573 queries, 3300361 answers from
> cache, 4168212 recursions
> server stats for thread 2: requestlist max 2607 avg 1768.51 exceeded 0
> server stats for thread 2: 7763213 queries, 3459687 answers from
> cache, 4303526 recursions
> server stats for thread 2: requestlist max 2615 avg 1792.17 exceeded 275
> server stats for thread 2: 8049494 queries, 3615698 answers from
> cache, 4433796 recursions
> server stats for thread 2: requestlist max 2649 avg 1814.33 exceeded 391
> server stats for thread 2: 8339874 queries, 3771948 answers from
> cache, 4567926 recursions
> server stats for thread 2: requestlist max 2673 avg 1836.72 exceeded 9291
> server stats for thread 2: 8675486 queries, 3928361 answers from
> cache, 4747125 recursions
> server stats for thread 2: requestlist max 2673 avg 1865.03 exceeded 79527
> server stats for thread 2: 9013544 queries, 4065706 answers from
> cache, 4947838 recursions
> server stats for thread 2: requestlist max 2673 avg 1894.08 exceeded 197196
> server stats for thread 2: 9325951 queries, 4184820 answers from
> cache, 5141131 recursions
> server stats for thread 2: requestlist max 2673 avg 1919.66 exceeded 320054
> server stats for thread 2: 9649570 queries, 4311387 answers from
> cache, 5338183 recursions
> server stats for thread 2: requestlist max 2673 avg 1943.58 exceeded 442555
> server stats for thread 2: 9883086 queries, 4386420 answers from
> cache, 5496666 recursions
> server stats for thread 2: requestlist max 2673 avg 1961.76 exceeded 567156
> server stats for thread 2: 10187789 queries, 4502652 answers from
> cache, 5685137 recursions
> server stats for thread 2: requestlist max 2673 avg 1982.28 exceeded 690181
> server stats for thread 2: 10527437 queries, 4638576 answers from
> cache, 5888861 recursions
> server stats for thread 2: requestlist max 2673 avg 2002.7 exceeded 818783
> server stats for thread 2: 10845189 queries, 4760934 answers from
> cache, 6084255 recursions
> server stats for thread 2: requestlist max 2673 avg 2020.86 exceeded 946745
> server stats for thread 2: 11084550 queries, 4838916 answers from
> cache, 6245634 recursions
> server stats for thread 2: requestlist max 2673 avg 2034.8 exceeded 1073714
> server stats for thread 2: 11402798 queries, 4959856 answers from
> cache, 6442942 recursions
> server stats for thread 2: requestlist max 2673 avg 2051.26 exceeded 1202308
> server stats for thread 2: 11722778 queries, 5082975 answers from
> cache, 6639803 recursions
> server stats for thread 2: requestlist max 2673 avg 2066.72 exceeded 1330516
> server stats for thread 2: 11993865 queries, 5179640 answers from
> cache, 6814225 recursions
> server stats for thread 2: requestlist max 2673 avg 2079.3 exceeded 1457073
> server stats for thread 2: 12286524 queries, 5286754 answers from
> cache, 6999770 recursions
> server stats for thread 2: requestlist max 2673 avg 2091.88 exceeded 1588787
> server stats for thread 2: 12569883 queries, 5390261 answers from
> cache, 7179622 recursions
> server stats for thread 2: requestlist max 2673 avg 2103.54 exceeded 1717405
> server stats for thread 2: 12833143 queries, 5480857 answers from
> cache, 7352286 recursions
> server stats for thread 2: requestlist max 2673 avg 2114.34 exceeded 1848543
> server stats for thread 2: 13088804 queries, 5566827 answers from
> cache, 7521977 recursions
> server stats for thread 2: requestlist max 2673 avg 2124.56 exceeded 1979982
> server stats for thread 2: 13339028 queries, 5649428 answers from
> cache, 7689600 recursions
> server stats for thread 2: requestlist max 2673 avg 2134.2 exceeded 2112551
> server stats for thread 2: 13589261 queries, 5731389 answers from
> cache, 7857872 recursions
> server stats for thread 2: requestlist max 2673 avg 2143.23 exceeded 2246431
> server stats for thread 2: 13831526 queries, 5809218 answers from
> cache, 8022308 recursions
> server stats for thread 2: requestlist max 2673 avg 2151.68 exceeded 2380025
> server stats for thread 2: 14074045 queries, 5885578 answers from
> cache, 8188467 recursions
> server stats for thread 2: requestlist max 2673 avg 2159.78 exceeded 2517341
> server stats for thread 2: 14310994 queries, 5958194 answers from
> cache, 8352800 recursions
> server stats for thread 2: requestlist max 2673 avg 2167.59 exceeded 2656417
> server stats for thread 2: 14542656 queries, 6028031 answers from
> cache, 8514625 recursions
> server stats for thread 2: requestlist max 2673 avg 2174.91 exceeded 2795338
> server stats for thread 2: 14768848 queries, 6093829 answers from
> cache, 8675019 recursions
> server stats for thread 2: requestlist max 2673 avg 2181.93 exceeded 2936464
> server stats for thread 2: 14993464 queries, 6158626 answers from
> cache, 8834838 recursions
> server stats for thread 2: requestlist max 2673 avg 2188.48 exceeded 3079132
> server stats for thread 2: 15216858 queries, 6222283 answers from
> cache, 8994575 recursions
> server stats for thread 2: requestlist max 2673 avg 2194.84 exceeded 3222538
> server stats for thread 2: 15440568 queries, 6282954 answers from
> cache, 9157614 recursions
> server stats for thread 2: requestlist max 2673 avg 2201.12 exceeded 3372315
> server stats for thread 2: 15663324 queries, 6344440 answers from
> cache, 9318884 recursions
> server stats for thread 2: requestlist max 2673 avg 2207.11 exceeded 3520766
> server stats for thread 2: 15888324 queries, 6406005 answers from
> cache, 9482319 recursions
> server stats for thread 2: requestlist max 2673 avg 2212.97 exceeded 3671367
> server stats for thread 2: 16110454 queries, 6465240 answers from
> cache, 9645214 recursions
> server stats for thread 2: requestlist max 2673 avg 2218.59 exceeded 3824213
> server stats for thread 2: 16332045 queries, 6523675 answers from
> cache, 9808370 recursions
> server stats for thread 2: requestlist max 2673 avg 2224.04 exceeded 3978297
> server stats for thread 2: 16552213 queries, 6581148 answers from
> cache, 9971065 recursions
> server stats for thread 2: requestlist max 2673 avg 2229.27 exceeded 4132981
> server stats for thread 2: 16772511 queries, 6639153 answers from
> cache, 10133358 recursions
> server stats for thread 2: requestlist max 2673 avg 2234.27 exceeded 4287371
> server stats for thread 2: 16992902 queries, 6698239 answers from
> cache, 10294663 recursions
> server stats for thread 2: requestlist max 2673 avg 2239.12 exceeded 4439600
> server stats for thread 2: 17213660 queries, 6756507 answers from
> cache, 10457153 recursions
> server stats for thread 2: requestlist max 2673 avg 2243.8 exceeded 4594324
> server stats for thread 2: 17432380 queries, 6812701 answers from
> cache, 10619679 recursions
> server stats for thread 2: requestlist max 2673 avg 2248.35 exceeded 4750633
> server stats for thread 2: 17650889 queries, 6868541 answers from
> cache, 10782348 recursions
> server stats for thread 2: requestlist max 2673 avg 2252.86 exceeded 4907895
> 
> Thanks in advance for any suggestion you can give.
> _______________________________________________
> Unbound-users mailing list
> Unbound-users at unbound.net
> http://unbound.nlnetlabs.nl/mailman/listinfo/unbound-users

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAktWwbEACgkQkDLqNwOhpPjY+gCgipjS1LfeAvGfjQzkwYui9Bhx
6q8AmQFcmxcXpGWC+KnOdMYTZItmGUyv
=614H
-----END PGP SIGNATURE-----