[Unbound-users] "peering" unbound servers together

Fri Apr 19 18:52:37 UTC 2013

On Fri, Apr 19, 2013 at 05:52:08PM +0200, Graham Beneke wrote:
> I've been trying to figure out for a while what potential optimizations
> may be possible on DNS resolvers that have high latency (50+ ms) to the
> typical locations of authoritative servers in the USA and EU.
> 
> I could cluster them together in a parent-child arrangement in order to
> get the maximum sharing of cached answers. This does however introduce
> an undesirable upstream point of failure.
> 
> I was then thinking that the following process may yield an improvement:
> 
> A query is received from a stub resolver for which an answer is not
> immediately available from the local cache. The resolver first forwards
> this query to a neighbor resolver (hoping for a cache hit) and then
> directly after that (or delayed by ~10 ms) begins its own full recursion.
> 
> We end up with 2 (or more) resolvers all racing to get to the answer
> first. Whichever answer (neighbor or authoritative) is returned to the
> original server first is then cached and returned to the stub.
> 
> This does mean that neighbor resolvers are potentially both doing the
> same recursion at the same time but I'm not too worried about this. It
> has the side effect of filling both caches with a valid answer which I
> consider a good thing. The primary objective is the fastest possible
> responses to the stub resolvers.
> 
> I don't see any immediately obvious way to build a configuration that
> will do this - have I missed something?
> 
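
The race described above could be sketched roughly as follows. This is only an
illustration in Python, not something Unbound offers as configuration: race(),
neighbor_lookup and full_recursion are hypothetical names, and the 10 ms head
start is taken from the description above.

```python
import asyncio

async def race(neighbor_lookup, full_recursion, head_start=0.010):
    """Return (source, answer) from whichever lookup succeeds first.

    neighbor_lookup and full_recursion are hypothetical coroutine functions
    that take no arguments and either return an answer or raise.
    """
    async def ask_neighbor():
        return "neighbor", await neighbor_lookup()

    async def delayed_recursion():
        # Give the neighbor a short head start before recursing ourselves.
        await asyncio.sleep(head_start)
        return "authoritative", await full_recursion()

    pending = {asyncio.ensure_future(ask_neighbor()),
               asyncio.ensure_future(delayed_recursion())}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                # A failed lookup is ignored as long as the other may succeed.
                if task.exception() is None:
                    return task.result()
        raise LookupError("both lookups failed")
    finally:
        # Simplification: the loser is cancelled here.
        for task in pending:
            task.cancel()
```

The sketch cancels the slower lookup. A resolver that wants both caches filled,
as described above, would let its own recursion run to completion instead.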

I've also been thinking how certain things could be improved.

My reason was different: to reduce the number of queries being sent to DNSBL
nameservers.

If you want to build something similar, you'll probably have to do what, I think,
OpenDNS and Google Public DNS are doing, which is to create a shared cache.

I guess it is a bit like how large websites use memcached.

This avoids sending the same query multiple times from different recursors, and in
the ideal case you can replicate that cache information to different regions as
well.
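
A minimal sketch of the shared-cache idea, in Python. SharedCache and resolve()
are hypothetical names, and a plain dict stands in for whatever networked store
(memcached or similar) the recursors would really share. It is not a description
of how OpenDNS or Google actually implement it.

```python
import time

class SharedCache:
    """Toy stand-in for a cache shared by several recursors."""

    def __init__(self, clock=time.monotonic):
        self.store = {}      # (qname, qtype) -> (answer, expiry time)
        self.clock = clock   # injectable so expiry can be tested

    @staticmethod
    def _key(qname, qtype):
        # Normalise case and the trailing dot so equal names share an entry.
        return (qname.lower().rstrip("."), qtype)

    def put(self, qname, qtype, answer, ttl):
        self.store[self._key(qname, qtype)] = (answer, self.clock() + ttl)

    def get(self, qname, qtype):
        key = self._key(qname, qtype)
        entry = self.store.get(key)
        if entry is None:
            return None
        answer, expires = entry
        if self.clock() >= expires:
            del self.store[key]  # TTL ran out: treat as a miss
            return None
        return answer

def resolve(cache, qname, qtype, recurse):
    """Answer from the shared cache, recursing only on a miss.

    recurse is a hypothetical callable returning (answer, ttl).
    """
    answer = cache.get(qname, qtype)
    if answer is None:
        answer, ttl = recurse(qname, qtype)
        cache.put(qname, qtype, answer, ttl)
    return answer
```

With several recursors calling resolve() against the same cache, only the first
one to see a name has to query the authoritative servers.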

To lower the latency of a request from a stub, they also send a new query for
cached records that are about to be removed from the cache because their DNS TTL
is running out.
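
The refresh decision could look something like this in Python. needs_prefetch()
is a hypothetical helper, and the 10% threshold is an assumption chosen for
illustration, not a value taken from either service.

```python
def needs_prefetch(original_ttl, remaining_ttl, threshold=0.10):
    """True when a still-valid cache entry is close enough to expiry
    that a fresh upstream query should be started in the background."""
    if remaining_ttl <= 0:
        return False  # already expired: this is an ordinary cache miss
    return remaining_ttl <= original_ttl * threshold
```

The stub still gets the cached answer immediately; the new query only replaces
the entry before it expires, so popular names rarely cause a slow lookup.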

If you do this for stubs in different regions, you should probably also have an
implementation that supports EDNS client subnet.
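
One consequence for a shared cache is that the client's network becomes part of
the cache key, since answers may differ per region. A small Python sketch of
that idea follows. cache_key() is a hypothetical helper, and the /24 and /56
prefix lengths are assumptions made for the example.

```python
import ipaddress

def cache_key(qname, qtype, client_ip, v4_prefix=24, v6_prefix=56):
    """Build a cache key that separates answers by client subnet."""
    addr = ipaddress.ip_address(client_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False masks the host bits instead of rejecting them.
    subnet = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return (qname.lower().rstrip("."), qtype, str(subnet))
```

Two stubs in the same subnet then share a cache entry, while stubs in other
networks get their own.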

> How difficult is it likely to be to build this capability into unbound?
> 

I do know that, years ago, Bert from PowerDNS added some memcached code to the
PowerDNS-recursor SVN repository, but that code was never completed.

It uses UDP to talk to memcached, which should be familiar to programmers working
on recursors.
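
For illustration, this is roughly what a memcached request over UDP looks like,
sketched in Python (this is not the PowerDNS code). Each datagram carries an
8-byte frame header of four 16-bit big-endian fields (request ID, sequence
number, total datagrams, reserved) followed by the usual text command. The
function names and the "example.com/A" key format are made up for the example.

```python
import struct

def memcached_udp_get(key, request_id):
    """Build a single-datagram memcached "get" request for UDP."""
    # Frame header: request ID, sequence number 0, 1 datagram in total,
    # and a reserved field that must be 0.
    header = struct.pack("!HHHH", request_id, 0, 1, 0)
    return header + b"get " + key.encode("ascii") + b"\r\n"

def split_udp_frame(datagram):
    """Split a datagram into (request_id, sequence, total, payload)."""
    request_id, seq, total, _reserved = struct.unpack("!HHHH", datagram[:8])
    return request_id, seq, total, datagram[8:]
```

The request ID plays much the same role as the ID field in a DNS message: the
reply carries it back, so the sender can match responses to outstanding
requests.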

memcached has supported a UDP protocol for a long time. If you had to build it
right now, you might want to look at Redis: it also now has a UDP protocol, and
it supports persistence and replication.

> -- 
> Graham Beneke