Maintained by: NLnet Labs

[Unbound-users] Unbound stops responding

W.C.A. Wijngaards
Fri Aug 16 15:24:39 CEST 2013


Hi Petter,

On Windows we currently have a limit of 64, not 1024.  This is
because of a Windows limit in its WSAWaitForMultipleEvents winsock
function.  We also need to wait on 'locks' for inter-thread
communication, since Windows does not implement the pipes that we use
on Unix.

So the machine is massively unable to keep up with the work.

When you have the issue, the number of cache hits is roughly the same
(which suggests it did respond to cache hits?).  But the number of
non-cache queries is over 600,000 (about 20x the normal number).  This
causes the CPU spike.  And it cannot answer any of these queries
because it has no sockets left, and the upstream is unresponsive anyway.
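
As a rough check of the figures in the quoted report below, the two
"server stats" lines can be compared directly.  This is only a minimal
Python sketch: the helper names parse_stats and ratios are made up for
illustration, and the two log lines are copied from the report.

```python
import re

# Log lines copied from the quoted report (thread 0, one hour apart).
BEFORE = ("server stats for thread 0: 137441 queries, "
          "107082 answers from cache, 30359 recursions, 0 prefetch")
DURING = ("server stats for thread 0: 717834 queries, "
          "79368 answers from cache, 638466 recursions, 0 prefetch")

PATTERN = re.compile(
    r"(\d+) queries, (\d+) answers from cache, (\d+) recursions")


def parse_stats(line):
    """Return (queries, cache_hits, recursions) from a stats line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a server stats line: %r" % line)
    return tuple(int(group) for group in match.groups())


def ratios(before, during):
    """Return during/before ratios for each counter."""
    q0, c0, r0 = parse_stats(before)
    q1, c1, r1 = parse_stats(during)
    return {
        "queries": q1 / q0,
        "cache_hits": c1 / c0,
        "recursions": r1 / r0,
    }


if __name__ == "__main__":
    for name, value in ratios(BEFORE, DURING).items():
        print("%-10s x%.2f" % (name, value))
```

With these numbers the recursion count comes out at about 21 times the
earlier figure, while answers from cache drop to roughly three quarters
of it.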

I think the 20x traffic increase may be why it does not answer from
the cache.  It should have answered the cache hits, but if the network
stack is swamped with packets, unbound does not get a chance.

Or is the machine easily capable of 20x the network load? (Note that
UDP load may be different from other load.)  Then we should look at
the performance under this denial-of-service condition.  I am
especially worried because your num-queries-per-thread is a lot bigger
than the number of sockets you have, which does not (usually) happen
on Unix systems.
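
To make that last concern concrete, here is a minimal Python sketch
that reads the two settings from unbound.conf-style text and flags the
mismatch.  The function names and the sample values are hypothetical,
and treating 64 as the ceiling is an assumption taken from the limit
mentioned above; this is not something Unbound itself provides.

```python
# Assumption: the limit of 64 on Windows mentioned in this message.
WINDOWS_WAIT_LIMIT = 64

# Hypothetical sample values, not the poster's actual configuration.
SAMPLE_CONF = """
server:
    outgoing-range: 950          # above the assumed Windows ceiling
    num-queries-per-thread: 1024
"""


def read_settings(conf_text):
    """Collect the two integer options from unbound.conf-style text."""
    wanted = {"outgoing-range", "num-queries-per-thread"}
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        if name in wanted:
            found[name] = int(value)
    return found


def check(settings, socket_limit=WINDOWS_WAIT_LIMIT):
    """Return a list of warnings for settings that exceed the limit."""
    warnings = []
    outgoing = settings.get("outgoing-range")
    per_thread = settings.get("num-queries-per-thread")
    if outgoing is not None and outgoing > socket_limit:
        warnings.append("outgoing-range %d exceeds the limit of %d"
                        % (outgoing, socket_limit))
    # Sockets that can actually be used under the assumed limit.
    usable = socket_limit if outgoing is None else min(outgoing, socket_limit)
    if per_thread is not None and per_thread > usable:
        warnings.append("num-queries-per-thread %d is larger than the "
                        "%d usable sockets" % (per_thread, usable))
    return warnings


if __name__ == "__main__":
    for message in check(read_settings(SAMPLE_CONF)):
        print("warning:", message)
```

The check is deliberately simple: it only compares the two numbers
against the assumed ceiling and does not account for sockets used for
listening or for inter-thread communication.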

Best regards,
   Wouter

On 08/16/2013 02:38 PM, Petter Lindgren wrote:
> We’ve used Unbound for a few months for around 15 000 clients in our
> company.
> 
> 
> 
> We recently had an issue when our Internet provider had a big
> network error.
> 
> This meant that our three unbound resolvers couldn’t perform
> recursion.
> 
> 
> 
> But during the outage, they stopped responding to any queries and
> the CPU spiked.
> 
> This meant that we couldn’t resolve local names.
> 
> After the outage was resolved, they started responding as usual
> again.
> 
> 
> 
> Statistics before outage:
>
> 2013-08-14 11:40:21 unbound[1180:0] info: server stats for thread 0: 137441 queries, 107082 answers from cache, 30359 recursions, 0 prefetch
> 2013-08-14 11:40:21 unbound[1180:0] info: server stats for thread 0: requestlist max 38 avg 4.54215 exceeded 0 jostled 0
> 2013-08-14 11:40:21 unbound[1180:0] info: average recursion processing time 0.120068 sec
> 2013-08-14 11:40:21 unbound[1180:0] info: histogram of recursion processing times
> 2013-08-14 11:40:21 unbound[1180:0] info: [25%]=0.00974991 median[50%]=0.0314827 [75%]=0.162277
> 
> 
> 
> Statistics during outage:
>
> 2013-08-14 12:40:21 unbound[1180:0] info: server stats for thread 0: 717834 queries, 79368 answers from cache, 638466 recursions, 0 prefetch
> 2013-08-14 12:40:21 unbound[1180:0] info: server stats for thread 0: requestlist max 893 avg 649.479 exceeded 520726 jostled 90153
> 2013-08-14 12:40:21 unbound[1180:0] info: average recursion processing time 16.060234 sec
> 2013-08-14 12:40:21 unbound[1180:0] info: histogram of recursion processing times
> 2013-08-14 12:40:21 unbound[1180:0] info: [25%]=0.0128234 median[50%]=0.046963 [75%]=0.241531
> 
> 
> 
> 
> 
> What can I do to make sure that Unbound still responds to a local
> query when there is an outage?
> 
> 
> 
> We run Unbound on Windows Server 2012 x64 (because we have more 
> competence using Windows)
> 
> 
> 
> Is it possible to overcome the 1024 limit for /outgoing-range/ and 
> /num-queries-per-thread/ when running Windows and would it have
> helped?
> 
> 
> 
> I would really appreciate comments and suggestions!
> 
> 
> 
> Thanks!
> 
> /Petter Lindgren/
> 
