Aaron Hopkins
Tue Nov 25 19:07:05 CET 2008

On Tue, 25 Nov 2008, Wouter Wijngaards wrote:
> What is happening is that the server has blacklisted the forwarder IP
> address.  Because it does not answer any queries (it has to be
> unreachable for about 2 minutes or more for that to happen).

Turning a 2 minute outage into a 17 minute outage by default is awful
behavior.  Dmitriy is being hit particularly hard here because he's only
talking to one forwarder, but I assume this will happen just as easily with
the root, .com, etc. if my internet connectivity goes down for 2 minutes but
my users are still actively trying to get somewhere new.

Blacklisting a subset of nameservers for a zone for a while is sane, as long
as you have someone left to talk to.  But as soon as all possible IPs to
send a query to are marked unresponsive, you can't just decide to not do any
lookups for the zone for an extended period.  Is it unreasonable to ask for
either a much shorter blacklist TTL in the all-IPs-unavailable case or to do
some form of low-volume probing (e.g. allow one query through per minute, as
a test)?
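
To make the probing idea concrete, here is a rough sketch in Python.  It is
not how the server is implemented; the class name, the method names, and the
900 and 60 second values are all made up for illustration.  The only point is
that when every IP is blacklisted, one query per interval is still let
through as a test instead of refusing lookups for the whole TTL.

```python
class ProbingSelector:
    """Hypothetical server selector: blacklist unresponsive IPs, but when
    every IP is blacklisted, let one test query through per interval."""

    def __init__(self, servers, blacklist_ttl=900.0, probe_interval=60.0):
        # Both timing values are illustrative assumptions, in seconds.
        self.blacklist_ttl = blacklist_ttl
        self.probe_interval = probe_interval
        # Time until which each IP counts as blacklisted (0.0 = usable).
        self.blacklisted_until = {ip: 0.0 for ip in servers}
        self.last_probe = float("-inf")

    def mark_unresponsive(self, ip, now):
        self.blacklisted_until[ip] = now + self.blacklist_ttl

    def mark_responsive(self, ip):
        # A successful answer (including a probe) clears the blacklist entry.
        self.blacklisted_until[ip] = 0.0

    def pick(self, now):
        """Return an IP to query, or None if the query should be refused."""
        usable = [ip for ip, until in self.blacklisted_until.items()
                  if until <= now]
        if usable:
            return usable[0]
        # Every IP is blacklisted: allow at most one probe per interval.
        if now - self.last_probe >= self.probe_interval:
            self.last_probe = now
            # Probe the IP whose blacklist entry expires soonest.
            return min(self.blacklisted_until, key=self.blacklisted_until.get)
        return None
```

With a single forwarder, this refuses most queries during the outage but
still sends one per minute, so the forwarder is back in use about a minute
after connectivity returns rather than after the full blacklist TTL.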

> It would then come back up within a minute after the connection is
> reestablished.

This should be the default.

> This config parameter also sets how long roundtrip times and
> EDNS-support is cached.

It is unfortunate that they are the same parameter.  I'd like to increase
the caching of roundtrip times and EDNS-support, and decrease the blacklist

