[Unbound-users] Old or incorrect information returned?

W.C.A. Wijngaards
Fri Nov 6 14:54:05 CET 2009


On 11/06/2009 01:53 PM, Haw Loeung wrote:
> On Fri, Nov 6th, 2009 at 6:30 PM, "W.C.A. Wijngaards"<wouter at NLnetLabs.nl>  wrote:
>> It could also be a bug where due to a miscalculation inside
>> the resolver the TTL becomes -1 (or infinite), but although
>> such a bug is fixed recently (in svn trunk) for DNSSEC bogus
>> messages, my guess is you are not DNSSEC validating.
>>
>
> Thinking about this further, if the TTL becomes -1, shouldn't it consider that cache entry stale and
> look it up again? I mean, is there any reason for entries to be in the cache forever?

This is a bug and should not happen.  Actually, -1 is a misnomer: I mean
a miscalculation that results in a far too large TTL.  It is fairly
rare, though it hits DNS resolvers from time to time.  If this really
happens, you can spot it with the unbound-control dump_cache command:
the domain in question shows insanely large TTL values (in either the
RR or the msg entries).
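
As a rough aid for that check, here is a minimal Python sketch that scans
saved dump_cache output for oversized TTLs.  It assumes the RRset entries
are zone-file style lines ("name TTL class type rdata") and only looks at
those; msg entries are skipped.  The function name, the one-week threshold
and the sample data are hypothetical, chosen for illustration.

```python
# Sketch: flag suspiciously large TTLs in saved `unbound-control dump_cache` output.
# Assumption: RRset entries look like "name TTL class type rdata".
MAX_SANE_TTL = 7 * 24 * 3600  # hypothetical threshold: one week, in seconds


def find_large_ttls(dump_text, threshold=MAX_SANE_TTL):
    """Return (name, ttl, rtype) for each record line whose TTL exceeds threshold."""
    hits = []
    for line in dump_text.splitlines():
        fields = line.split()
        # Skip comment lines and anything too short to be a record.
        if len(fields) < 4 or line.startswith(";"):
            continue
        name, ttl, rclass, rtype = fields[:4]
        # Section markers and msg lines have no numeric TTL in the second field.
        if not ttl.isdigit() or rclass != "IN":
            continue
        if int(ttl) > threshold:
            hits.append((name, int(ttl), rtype))
    return hits


# Made-up sample in the assumed format.
sample = """START_RRSET_CACHE
;rrset 3600 1 0 3 3
example.com. 3600 IN A 192.0.2.1
stuck.example. 4294967295 IN A 192.0.2.2
END_RRSET_CACHE
"""

print(find_large_ttls(sample))  # [('stuck.example.', 4294967295, 'A')]
```

To try it on real data, save the cache first (unbound-control dump_cache
> cache.txt) and pass the file contents to find_large_ttls().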

Unbound does not lock on to old NS servers, even if they keep serving
the old zone.  After the NS record times out it should fetch the
delegation again and use that until it expires.  This can mean that if
it fetches an NS record that is valid for 24h, and at the end of that
period fetches an A record that is also valid for 24h, then the A record
can still be the old one until 48h after the redelegation.  This is the
worst-case scenario: the refetch of the NS happens at the last moment
the old NS is available, and the A gets refetched just before that bad
NS record expires.  After that the TTLs expire and the new data is used.
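
The worst-case arithmetic can be written out as a small Python sketch.
It assumes, as in the scenario described, that the old NS set is cached
right at the moment of redelegation and that the A record is refetched
from the old server just before that NS set expires.  The function name
is hypothetical.

```python
HOUR = 3600  # seconds


def worst_case_stale_seconds(ns_ttl, a_ttl):
    """Longest time after a redelegation that the old A record can stay cached."""
    ns_cached_at = 0                       # old NS fetched at the redelegation
    ns_expires_at = ns_cached_at + ns_ttl  # resolver keeps asking the old servers until here
    a_cached_at = ns_expires_at            # old A refetched just before the NS expires
    return a_cached_at + a_ttl             # old A finally expires


print(worst_case_stale_seconds(24 * HOUR, 24 * HOUR) // HOUR)  # 48
```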

The phrase 'all the others do not have the problem, please reload your
cache' might have been used with all the other ISPs too :-)  No, with
unbound being newer, it is probably more likely to behave differently
in any case.

From your data they waited for one day.  Could it be this that
triggered the complaint?  Could a different way of keeping the cache,
combined with the old servers still serving the old zone, make the time
until the new data is displayed longer than the one day they expected?
(Or did they expect even faster convergence, even though they used a
24h TTL?)

Best regards,
    Wouter