Maintained by: NLnet Labs

[Unbound-users] Python API extension patch proposal

Stephane LAPIE
Mon Jan 5 12:40:25 CET 2015

Hi Wouter,

On 01/05/2015 07:21 PM, W.C.A. Wijngaards wrote:
> > This patch, along with an actual module that will SERVFAIL (as
> > above) cache-missing queries going over threshold (therefore
> > reducing upstream traffic to a tenth of what it would be if
> > honoring DDoS-related queries, AND keeping it within our AS), has
> > been running in our production environment at ASAHI Net for several
> > months now, and has been approved for upstream contribution on our
> > side.
> Thank you for the patch, I have put it in the source.  Can you tell
> the allow_query() details that work for you (the threshold and what
> you do with the AS specifically)?

Many thanks for your understanding.
I'll try to provide as many notes as I am allowed to.

I don't have total clearance yet to publish the full code, or the exact
thresholds, but basically here's what I am doing and where the reasoning
comes from.

* Storing information :
I am storing query details for :
- the client's address
- the delegpt name (simplified form of
"", derived via infra cache), which I will refer to hereafter as the "domain"
- the "client-domain" pair

Then, for each of these, I create Python dictionaries (it's tricky to
handle locking properly though ;)) that have a series of counters for :
- Normal query count (anything short of ANY)
-> If used in conjunction with the AAAA filter, ignore these queries as
they will never return meaningful information
- ANY query count
-> This is usually blocked at firewall level, but I wanted to try a few
- NXDOMAIN count (I get the return code from another event)
-> A server capable of answering NXDOMAINs will crank up this counter
extremely fast in case of a violent attack
- RRSET count (if the query was a success, and had meaningful data, I
check the answer's RRSET count)
-> It could probably be used to detect and block weird forms of AMP
attacks, using the TXT records for instance.
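
As an illustration only, the bookkeeping described above could be sketched
as follows. The class and field names (Counters, CounterStore, record) are
hypothetical and not taken from the actual module, and a single coarse lock
stands in for whatever locking scheme is really used.

```python
import threading
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Counters:
    normal: int = 0    # anything short of ANY
    any_: int = 0      # ANY queries
    nxdomain: int = 0  # NXDOMAIN return codes seen
    rrset: int = 0     # RRsets found in successful answers


class CounterStore:
    """Three dictionaries (client, domain, client-domain) behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.clients = defaultdict(Counters)
        self.domains = defaultdict(Counters)
        self.pairs = defaultdict(Counters)

    def record(self, client, domain, field, amount=1):
        # Bump the same counter in all three tables under the lock.
        targets = (
            (self.clients, client),
            (self.domains, domain),
            (self.pairs, (client, domain)),
        )
        with self._lock:
            for table, key in targets:
                entry = table[key]
                setattr(entry, field, getattr(entry, field) + amount)


store = CounterStore()
store.record("192.0.2.1", "example.com.", "normal")
store.record("192.0.2.1", "example.com.", "nxdomain")
store.record("192.0.2.2", "example.com.", "any_")
```

A single lock keeps the sketch simple; per-table locks would reduce
contention at the cost of more careful ordering.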

Then, I have set thresholds for each of these counters, applicable to
client, domains, and client-domain pairs.
These thresholds are basically hand-set, applying roughly an 80% ratio on
the kind of full-scale DDoS we take :
- A given DDoS participant gives out 4 attack queries / second to that
domain, which is around 1200 cache-missing queries in 5 minutes.
- A given domain in a DDoS therefore takes around a hundred of these at
the very least.
- One can therefore start flagging a client-domain relation around 5
cache-missing queries in 5 minutes. (not outright blocking ! the purpose
is to contain an attack without harming legitimate use)
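
The arithmetic behind these figures can be written out as below. The exact
thresholds are not published, so applying the 80% ratio directly to the
per-participant figure is only an assumption made for illustration, and
the variable names are hypothetical.

```python
WINDOW_SECONDS = 5 * 60   # the five-minute window used in the text
ATTACK_QPS = 4            # attack queries per second from one participant
RATIO_PERCENT = 80        # "roughly an 80% ratio" (how it is applied is assumed)

# Cache-missing queries one participant sends to the domain per window.
observed = ATTACK_QPS * WINDOW_SECONDS

# Hypothetical hand-set maximum derived from the observed figure.
client_domain_max = observed * RATIO_PERCENT // 100

print(observed, client_domain_max)
```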

* Decision process (summary) :
- In either of these two cases :
-- A client breaches the "single client" threshold (this guy is sending
way too much stuff to be legit, no matter what), or the "client-domain"
maximum threshold
-- A domain is above the DDoS threshold (given how many queries it has
received in five minutes, it's being hammered), and the "client-domain"
is within suspicion range
my implementation of allow_query() will return false, and it will be
- For mitigation of false positives :
-- The above mainly uses the normal, ANY and NXDOMAIN counters and
positive thresholds
-- In order to avoid false positives, I check if the RRSET counter is
greater than zero, and if so, this means this domain, or client, or
client-domain pair has actually dealt with meaningful info and is
probably not involved in a bona fide DDoS, since it provides stuff that
will fill the cache with meaningful info.
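
A minimal sketch of this decision, under stated assumptions: the threshold
values are placeholders (the real ones are not published), the function
signature is hypothetical and is not the Unbound Python module hook, and
the RRSET check is applied per entity, which is only one possible reading
of the false-positive rule.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    normal: int = 0
    any_: int = 0
    nxdomain: int = 0
    rrset: int = 0

    def suspicious_total(self):
        # The decision mainly uses the normal, ANY and NXDOMAIN counters.
        return self.normal + self.any_ + self.nxdomain


# Placeholder thresholds, for illustration only.
CLIENT_MAX = 960       # "single client" threshold
PAIR_MAX = 960         # "client-domain" maximum threshold
PAIR_SUSPICION = 5     # "client-domain" suspicion threshold
DOMAIN_DDOS = 100      # domain considered hammered


def over(counters, threshold):
    # An entity that produced meaningful answers (RRSET > 0) is exempted.
    return counters.rrset == 0 and counters.suspicious_total() >= threshold


def allow_query(client, domain, pair):
    """Return False when the query should be refused."""
    if over(client, CLIENT_MAX) or over(pair, PAIR_MAX):
        return False
    if over(domain, DOMAIN_DDOS) and over(pair, PAIR_SUSPICION):
        return False
    return True


quiet = Counters(normal=1)
hammered_domain = Counters(nxdomain=500)
print(allow_query(quiet, quiet, quiet))
print(allow_query(Counters(normal=10), hammered_domain, Counters(nxdomain=10)))
```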

* Data purging :
- Also, every minute, I decrement by a fifth of the total threshold.
This ensures I "forget" about an attacker, but not too fast.
- Last decrement step goes down to the suspicion threshold (if set), and
if the client really has been behaving properly and not hammering, then
it's fully decremented back to zero, otherwise SERVFAIL.
- Every five minutes, I weed out dictionary entries that have
already dropped to zero, to ensure I don't leak memory like crazy.
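
The decay and purge steps could look roughly like this. It is a simplified
sketch: a plain dictionary of integer counts stands in for the real
counters, the function names are hypothetical, and the intermediate stop
at the suspicion threshold is left out.

```python
def decay(counts, threshold):
    """Run every minute: subtract a fifth of the threshold, floor at zero."""
    step = max(1, threshold // 5)
    for key in counts:
        counts[key] = max(0, counts[key] - step)


def purge(counts):
    """Run every five minutes: drop entries that have reached zero."""
    for key in [k for k, v in counts.items() if v == 0]:
        del counts[key]


counts = {"192.0.2.1": 1000, "192.0.2.2": 100}
decay(counts, threshold=1000)
purge(counts)
print(counts)
```

With a step of one fifth of the threshold, an entry sitting exactly at the
threshold is forgotten after five quiet minutes.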

I could actually also implement dictionaries based on authority servers,
but this performs very poorly, and it's not possible to update all the
dictionaries in the time it takes to answer a query. I think the above
algorithm already considerably slows down the whole thing, but it still
works just fine, thank gods.

* Notes :
- I have actually noticed our attackers have caught on to this blocking
method, and they started attacking domains sharing the same authority
servers, thus splitting the load across several domains.
-> Actually, though, this can probably be blocked with an outgoing rate
limit at firewall level, but I was thinking it might make sense for
unbound's iterator module to store information on how heavily a target
server is being queried, to mitigate attacks.
- I am also sending out UDP broadcasts to a monitoring server when
blocking a query, to keep tabs on blocked domains. I am eventually
thinking of having a separate thread to get these broadcasts directly
from unbound so as to share the information that a given client is up to
no good, or that a domain is being hammered, or that a given client is
forbidden from accessing a specific domain.
-> It might be nice to have python module environment variables
accessible via a stats dump or something, food for thought.
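
For illustration, a block notification over UDP broadcast might be sent
like this. The JSON message layout, the function names, and the address
and port in the usage comment are all assumptions; the post does not
describe the actual wire format.

```python
import json
import socket


def encode_event(client, domain, reason):
    # Hypothetical payload: a small JSON object, UTF-8 encoded.
    event = {"client": client, "domain": domain, "reason": reason}
    return json.dumps(event, sort_keys=True).encode("utf-8")


def make_broadcast_socket():
    # UDP socket allowed to send to a broadcast address.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def notify(sock, address, client, domain, reason):
    # Fire and forget: a lost datagram only costs one monitoring sample.
    sock.sendto(encode_event(client, domain, reason), address)


# Usage (address and port are placeholders):
# sock = make_broadcast_socket()
# notify(sock, ("255.255.255.255", 9999), "192.0.2.1", "example.com.", "pair")
```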

* Measured effect :
- A client's queries are blocked only if they are really hammering the
server, or aggravating an already overwhelmed domain.
-> Legitimate traffic or queries for cached entries are not affected,
even if they are participating in a DDoS. This avoids harming "owned"
customers, and earns the extra time needed to actually help them out.
- Queries that should go upstream to the victim authority server are
stopped, and a SERVFAIL is answered to the client.
-> This not only ensures frivolous recursive queries are not performed,
it also ensures no negative caching is done on Unbound's side, and keeps
memory and network usage tight.

Concretely, this means that, instead of doing this :
1) Client -> ISP network -> Cache server : Sending query for (confirming 3MB/s ingoing traffic)
2) Cache server -> Internet peers -> Authority server : Sending query
for (confirming 30MB/s outgoing
traffic if unfiltered)
3) Cache server <- Internet peers <- Authority server : Receiving
NXDOMAIN or timing out, which triggers further retries
4) Client <- ISP network <- Cache server : Reply timeout for client, or
SERVFAIL if all authority servers for the delegation point are confirmed
dead (which can take a lot of time or never happen), clogging the
recursive client list

Eventually, the query flow becomes this, once a domain has been
confirmed as a DDoS target and a client as an attacker :
1) Client -> ISP network -> Cache server : Sending query for (confirming 3MB/s ingoing traffic)
2) Client <- ISP network <- Cache server : Replying SERVFAIL (confirming
3MB/s outgoing traffic, which remains inside of our network and does not
impact any of our network peers), and keeping the recursive client list

This is what I meant by "traffic staying within our AS": it avoids
polluting our network peers' pipes with crud, and avoids unnecessary
transit costs.
There is no fancy coordination with our BGP routing tables (yet ;)).

* Conclusion :
Basically, being able to look up the delegpt name is literally the
cornerstone of the whole above algorithm, as it allows shoving hundreds
of millions of queries into one counter and finally quantifying the
damage in a way that allows for fair blocking.

Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
