Maintained by: NLnet Labs

[Unbound-users] [PATCH] performance increase with SO_REUSEPORT on Linux 3.9+

Robert Edmonds
Tue Jan 7 04:05:54 CET 2014


There is a SO_REUSEPORT socket option available on Linux 3.9+, described
in this LWN article:

It is of particular interest to DNS servers, since it allows more evenly
distributing incoming queries among multiple UDP server sockets bound to
the same port:

    As with TCP, SO_REUSEPORT allows multiple UDP sockets to be bound to
    the same port. This facility could, for example, be useful in a DNS
    server operating over UDP. With SO_REUSEPORT, each thread could use
    recv() on its own socket to accept datagrams arriving on the port.
    The traditional approach is that all threads would compete to
    perform recv() calls on a single shared socket. As with the second
    of the traditional TCP scenarios described above, this can lead to
    unbalanced loads across the threads. By contrast, SO_REUSEPORT
    distributes datagrams evenly across all of the receiving threads.

    Tom noted that the traditional SO_REUSEADDR socket option already
    allows multiple UDP sockets to be bound to, and accept datagrams on,
    the same UDP port. However, by contrast with SO_REUSEPORT,
    SO_REUSEADDR does not prevent port hijacking and does not distribute
    datagrams evenly across the receiving threads.
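
The per-thread socket arrangement described in the quoted passage can be sketched in a few lines. This is a minimal illustration in Python, not code taken from the patch. The helper names open_reuseport_socket and open_worker_sockets are hypothetical, and the sketch assumes Linux 3.9+ and a Python build that exposes socket.SO_REUSEPORT.

```python
import socket


def open_reuseport_socket(host, port):
    """Create a UDP socket bound to (host, port) with SO_REUSEPORT set.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # The option must be set before bind(). Every socket sharing the
        # port has to set it, otherwise bind() fails with EADDRINUSE.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    return sock


def open_worker_sockets(host, num_workers):
    """Bind num_workers UDP sockets to one port; return (port, sockets).

    The first socket binds to port 0 so the kernel picks a free port;
    the remaining sockets bind to that same port.
    """
    first = open_reuseport_socket(host, 0)
    port = first.getsockname()[1]
    socks = [first]
    try:
        for _ in range(num_workers - 1):
            socks.append(open_reuseport_socket(host, port))
    except OSError:
        for s in socks:
            s.close()
        raise
    return port, socks
```

Each worker thread would then call recvfrom() on its own socket instead of competing on a shared one. As far as I understand it, the kernel picks the receiving socket per datagram from a hash of the packet's addresses and ports, so traffic from a single client socket will tend to land on one worker.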

There is also a slide deck available here:

I've written a patch for Unbound that enables per-thread server sockets
with SO_REUSEPORT on Linux.  It must be explicitly requested with
"so-reuseport: yes" because Linux < 3.9 doesn't support the socket
option (setting it will fail with ENOPROTOOPT).  Because the option has
different semantics on Linux than on other platforms with a similarly
named socket option, it has no effect on non-Linux platforms.
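
To illustrate the opt-in behaviour, here is a minimal sketch of how a server might apply the option only when requested and treat ENOPROTOOPT as "not supported by this kernel". It is written in Python for brevity and is not the logic from the patch. The function name try_enable_reuseport and its "requested" parameter are hypothetical, standing in for the value of a setting like "so-reuseport: yes".

```python
import errno
import socket


def try_enable_reuseport(sock, requested):
    """Set SO_REUSEPORT on sock if requested; return True if it was enabled.

    Hypothetical helper for illustration, not Unbound's implementation.
    """
    if not requested:
        # The option is opt-in: do nothing unless explicitly asked.
        return False
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is None:
        # This Python build does not expose the constant.
        return False
    try:
        sock.setsockopt(socket.SOL_SOCKET, opt, 1)
    except OSError as exc:
        if exc.errno == errno.ENOPROTOOPT:
            # Kernel does not know the option (e.g. Linux < 3.9).
            return False
        raise
    return True
```

A caller that gets False back could fall back to the traditional single shared socket rather than refusing to start. That fallback is an assumption of this sketch.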

Under my simplistic 100% cache hit microbenchmark, Unbound with
SO_REUSEPORT support running on Linux 3.12 achieves somewhat lower CPU
utilization at all query rates except the very highest (I suspect this
is because the patched Unbound is delivering a slightly higher response
rate).  E.g., at 810Kq/s from the traffic generator, unbound trunk
(r3036) consumes 75% of the system's 4 CPUs, while unbound with
SO_REUSEPORT consumes 66%.  I've attached a plot showing the benchmark
data graphically.

Robert Edmonds
edmonds at
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-so-reuseport-option-to-enable-SO_REUSEPORT-on-li.patch
Type: text/x-diff
Size: 19367 bytes
Desc: not available
URL: <>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot-unbound-soreuseport.png
Type: image/png
Size: 436586 bytes
Desc: not available
URL: <>