We're experiencing a peculiar issue with our bind
installation (version 9.8.4).
In this scenario, bind
is configured as a caching name server for a small network. For the large majority of queries, everything works fine.
However, we've noticed that for some hosts configured with a very low TTL, we sometimes get NXDOMAIN responses even though the host name exists.
As an example, take www.cdn77.com—here's the output of dig
when run on the name server itself:
$ dig www.cdn77.com
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> www.cdn77.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34440
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 6, ADDITIONAL: 12
;; QUESTION SECTION:
;www.cdn77.com. IN A
;; ANSWER SECTION:
www.cdn77.com. 196 IN CNAME 1669655317.rsc.cdn77.org.
1669655317.rsc.cdn77.org. 0 IN A 185.59.220.12
;; AUTHORITY SECTION:
org. 170517 IN NS a2.org.afilias-nst.info.
org. 170517 IN NS c0.org.afilias-nst.info.
org. 170517 IN NS b0.org.afilias-nst.org.
org. 170517 IN NS d0.org.afilias-nst.org.
org. 170517 IN NS a0.org.afilias-nst.info.
org. 170517 IN NS b2.org.afilias-nst.org.
;; ADDITIONAL SECTION:
a0.org.afilias-nst.info. 170517 IN A 199.19.56.1
a0.org.afilias-nst.info. 170517 IN AAAA 2001:500:e::1
a2.org.afilias-nst.info. 170517 IN A 199.249.112.1
a2.org.afilias-nst.info. 170517 IN AAAA 2001:500:40::1
b0.org.afilias-nst.org. 170517 IN A 199.19.54.1
b0.org.afilias-nst.org. 170517 IN AAAA 2001:500:c::1
b2.org.afilias-nst.org. 170517 IN A 199.249.120.1
b2.org.afilias-nst.org. 170517 IN AAAA 2001:500:48::1
c0.org.afilias-nst.info. 170517 IN A 199.19.53.1
c0.org.afilias-nst.info. 170517 IN AAAA 2001:500:b::1
d0.org.afilias-nst.org. 170517 IN A 199.19.57.1
d0.org.afilias-nst.org. 170517 IN AAAA 2001:500:f::1
;; Query time: 42 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Dec 2 14:27:03 2015
;; MSG SIZE rcvd: 487
And here's an example of when an NXDOMAIN response is returned:
$ dig www.cdn77.com
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> www.cdn77.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 28771
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;www.cdn77.com. IN A
;; ANSWER SECTION:
www.cdn77.com. 327 IN CNAME 1669655317.rsc.cdn77.org.
;; AUTHORITY SECTION:
cdn77.org. 59 IN SOA ns1.cdn77.org. admin.cdn77.com. 1449062655 10800 180 604800 60
;; Query time: 34 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Dec 2 14:24:52 2015
;; MSG SIZE rcvd: 115
We use Google's public name servers as forwarders, and they never seem to respond with NXDOMAIN:
$ dig www.cdn77.com @8.8.8.8
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> www.cdn77.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35091
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.cdn77.com. IN A
;; ANSWER SECTION:
www.cdn77.com. 851 IN CNAME 1669655317.rsc.cdn77.org.
1669655317.rsc.cdn77.org. 0 IN A 185.59.220.11
;; Query time: 40 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Wed Dec 2 14:29:16 2015
;; MSG SIZE rcvd: 85
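For reference, the Google forwarders mentioned above are set up along these lines in named.conf (a simplified sketch; the exact address list and the forward only directive are assumptions, not a copy of the real configuration):
options {
    // Hand recursion to Google Public DNS over both IPv4 and IPv6
    forwarders {
        8.8.8.8;
        2001:4860:4860::8888;
    };
    forward only;  // assumption: no fallback to iterative resolution
};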
The authoritative answer, by the way, looks like this:
$ dig 1669655317.rsc.cdn77.org @ns1.cdn77.org
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> 1669655317.rsc.cdn77.org @ns1.cdn77.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11529
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;1669655317.rsc.cdn77.org. IN A
;; ANSWER SECTION:
1669655317.rsc.cdn77.org. 1 IN A 185.59.220.12
;; Query time: 20 msec
;; SERVER: 37.235.105.100#53(37.235.105.100)
;; WHEN: Wed Dec 2 14:32:57 2015
;; MSG SIZE rcvd: 58
Interestingly, even though the authoritative TTL for the record is one second, Google's public name servers always reduce it to zero (see this article for an interesting read about this behavior). I don't think this has anything to do with the problem though, as the successful responses from our bind
also show TTL zero.
I've increased bind's logging level, but I find it very hard to identify any entries that might be related to the problem. Even with querylog
activated, all that's visible is the query itself and
resolver: debug 1: createfetch: 1669655317.rsc.cdn77.org A
lines.
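For completeness, the extra logging was enabled with rndc along these lines (the debug level is arbitrary, and the logging channels in named.conf may need adjusting to actually capture the output):
$ rndc querylog   # toggles query logging
$ rndc trace 3    # raises the debug level of the running named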
Any pointers towards how to better diagnose (or even solve) this issue would be greatly appreciated.
The upstream forwarders seem to have inconsistent data, although the cause of this is not clear. One forwarder in your round-robin is returning NXDOMAIN, which is then cached locally: Google's Public DNS over IPv6 (2001:4860:4860::8888) is currently returning NXDOMAIN, despite 8.8.8.8 working correctly (i.e., matching the authoritative answer).
The short-term solution is to remove the offending forwarder, then restart BIND or clear the negative cache.
See Alex Dupuy's answer for a clear explanation of the root cause.
The problem is that the authoritative nameservers for cdn77.org fail to properly handle ECS (EDNS Client-Subnet) options when they contain an IPv6 client subnet, although they handle IPv4 client subnets just fine.
If you build dig with EDNS client-subnet support, you can check this on the command line; alternatively, you can use the online KeyCDN DNS Lookup tool (select the details checkbox, de-select the recursive checkbox, and omit the @ before ns1 when you give it as Custom DNS):
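For example, with a dig build that understands the +subnet option, a query along these lines (the IPv6 prefix below is purely illustrative) sends an IPv6 client subnet and gets the bogus NXDOMAIN back from the authoritative server:
$ dig 1669655317.rsc.cdn77.org A @ns1.cdn77.org +subnet=2001:db8::/56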
The same query with an IPv4 client address works just fine:
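Again assuming +subnet support, and with an illustrative IPv4 prefix:
$ dig 1669655317.rsc.cdn77.org A @ns1.cdn77.org +subnet=192.0.2.0/24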
When you send your query to an IPv6 address for Google Public DNS, your client IP subnet is of course an IPv6 subnet, and when the authoritative server answers NXDOMAIN, the (cached?) answer for IPv6 clients is NXDOMAIN too. If you send your query to an IPv4 address for Google Public DNS, your client subnet is an IPv4 subnet, and you get the correct (possibly cached) answer.
Sorry for the inconvenience; this bug has been causing problems for only a handful of our clients, and Alex Dupuy has provided a great explanation of it. We have added IPv6 EDNS support and enabled IPv6 anycast on our DNS servers, and this problem is now gone.