I recently replaced a failed router and, a while later, discovered a lot of errors (at least an order of magnitude more than the number of queries) reported in /var/log/syslog, of the form:
Mar 18 19:53:20 kenneth named[4022]: DNS format error from 192.112.36.4#53 resolving ./NS: non-improving referral
Mar 18 19:53:20 kenneth named[4022]: error (FORMERR) resolving './NS/IN': 192.112.36.4#53
It might be relevant that I've got the following in bind.conf:
dnssec-enable no;
dnssec-validation no;
Is this likely an issue with the new router corrupting UDP datagrams, or something else? The new router is an (inexpensive) Netgear WNR854T - it has the latest firmware applied.
Can anyone suggest how best to diagnose this fault if it's not obvious from the above?
-- Additional details -- This is a typical response from dig for an address I'm sure should resolve.
$ dig A barclays.co.uk ~
; <<>> DiG 9.8.1-P1 <<>> A barclays.co.uk
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22161
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;barclays.co.uk. IN A
;; Query time: 84 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Mar 20 23:08:40 2013
;; MSG SIZE rcvd: 32
$
If you suspect a network element (such as your router) is truncating or corrupting UDP DNS traffic you can try the following:
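One possible check, sketched here in Python under stated assumptions, is to send the same query over UDP and over TCP and compare what comes back. A device that mangles or truncates UDP datagrams usually leaves TCP alone, so a difference in size, truncation flag, or record counts points at the path rather than at BIND. The helper names (`make_query`, `ask_udp`, `ask_tcp`, `summarize`, `compare`) are made up for this illustration, and the query carries no EDNS option, so UDP replies are limited to 512 bytes.

```python
import socket
import struct


def make_query(name, qtype=1, query_id=0x4242):
    """Build a plain DNS query (no EDNS) with the RD flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [part for part in name.rstrip(".").split(".") if part]
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def ask_udp(server, query, timeout=3.0):
    """Send the query in a single UDP datagram and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, 53))
        return sock.recvfrom(4096)[0]


def recv_exact(sock, count):
    """Read exactly `count` bytes from a stream socket."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def ask_tcp(server, query, timeout=3.0):
    """Send the query over TCP, which uses a 2-byte length prefix."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)


def summarize(packet):
    """Extract size, TC flag, RCODE and section counts from a reply."""
    _, flags, _, answers, authority, additional = struct.unpack("!HHHHHH", packet[:12])
    return {
        "size": len(packet),
        "truncated": bool(flags & 0x0200),
        "rcode": flags & 0x000F,
        "counts": (answers, authority, additional),
    }


def compare(server, name, qtype=1):
    """Return the UDP and TCP summaries side by side for one query."""
    query = make_query(name, qtype)
    return {"udp": summarize(ask_udp(server, query)),
            "tcp": summarize(ask_tcp(server, query))}
```

Calling `compare()` with the address of an upstream resolver or authoritative server and a name that fails for you (for example barclays.co.uk) gives two summaries to compare. A legitimate server may also set the truncation flag on a large UDP reply, so the interesting cases are replies that differ in RCODE or record counts without that flag, or UDP replies that never arrive.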
Are all of the FORMERR errors in your log complaining about non-improving referrals? What does dig say is in the "additional" section of the queries that generate these error messages?
Finally, do you have stub zones or forward-first or forward-only zones set up that you haven't mentioned?
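To answer the first question without reading the log by eye, a short script can tally the reasons BIND gives. This is a minimal sketch that assumes the log lines look like the "DNS format error" line quoted in the question; the pattern and the `tally` helper are illustrative, not part of BIND.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... named[4022]: DNS format error from 192.112.36.4#53 resolving ./NS: non-improving referral
FORMAT_ERR = re.compile(
    r"named\[\d+\]: DNS format error from (?P<server>[0-9a-fA-F.:]+)#\d+ "
    r"resolving (?P<query>\S+): (?P<reason>.+)$"
)


def tally(lines):
    """Count format-error reasons and the servers they are attributed to."""
    reasons, servers = Counter(), Counter()
    for line in lines:
        match = FORMAT_ERR.search(line)
        if match:
            reasons[match["reason"].strip()] += 1
            servers[match["server"]] += 1
    return reasons, servers


if __name__ == "__main__":
    # /var/log/syslog is the path mentioned in the question.
    with open("/var/log/syslog", errors="replace") as log:
        reasons, servers = tally(log)
    for reason, count in reasons.most_common():
        print(count, reason)
    for server, count in servers.most_common(10):
        print(count, server)
```

If every reason is "non-improving referral" and the servers are all root servers, that narrows the problem to the root referrals rather than to random corruption.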
I am visiting Indonesia and I get these messages because the operator/government sabotages DNS. DNSSEC fails to verify, and direct access to the root servers causes FORMERR messages to be logged (bind9, 13 messages for every DNS query). The address in your example is DNS root server G.
Your DNS traffic may be transparently intercepted and routed to a caching DNS server. If you then request records whose complete answer does not fit in a short response, the caching DNS server returns a truncated answer rather than a minimal one (compare "minimal-responses yes;"). For example, www.nvidia.com.edgekey.net is resolved via CNAMEs and a long list of nameservers, and its complete answer does not fit in a 500-ish byte response. Here are the steps:
The OP's question includes an equivalent description of the problem (non-improving NS record == unrelated record in authority section):
Arguably the problem is one or several bugs in BIND itself:
I should note that queries that can be answered completely in a small packet do not trigger these bugs, and therefore, most DNS queries are resolved correctly.
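A rough way to test the interception theory is to query a root server directly and look at the header flags of the reply. The sketch below is an illustration only: the function names are made up, and the interpretation rests on the assumption that a genuine root server answers ./NS authoritatively (AA set) and does not offer recursion (RA clear), whereas an intercepting cache typically sets RA and clears AA.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234, rd=False):
    """Build a minimal DNS query packet (no EDNS) for `name`."""
    flags = 0x0100 if rd else 0x0000
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".") if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def parse_header(packet):
    """Decode the 12-byte DNS header into flags and section counts."""
    ident, flags, _, answers, authority, additional = struct.unpack(
        "!HHHHHH", packet[:12]
    )
    return {
        "id": ident,
        "response": bool(flags & 0x8000),
        "authoritative": bool(flags & 0x0400),        # AA
        "truncated": bool(flags & 0x0200),            # TC
        "recursion_available": bool(flags & 0x0080),  # RA
        "rcode": flags & 0x000F,
        "answer": answers,
        "authority": authority,
        "additional": additional,
        "size": len(packet),
    }


def probe(server, name=".", qtype=2, timeout=3.0):
    """Ask `server` for name/NS over UDP with RD clear; return the header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        data, _ = sock.recvfrom(4096)
    return parse_header(data)
```

Running `probe("192.112.36.4")` against the address from the log and seeing `recursion_available` set, or `truncated` set on a reply to a plain ./NS query, would be consistent with something other than the root server answering. It is a hint rather than proof.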
Possible solutions: