Question

Andrew B

Asked: 2014-06-14 16:44:30 +0800 CST

Why does dig +trace sometimes fail against Windows Server DNS?

My team has a server pointing at the DNS supplied by Active Directory to ensure that it is able to reach any hosts managed by the domain. Unfortunately, my team also needs to run dig +trace frequently and we sporadically get strange results. I am a DNS admin, not a domain admin, and the team responsible for these servers isn't sure what is going on here either.

The problem seems to have shifted around between OS upgrades, but it's hard to say whether that's a characteristic of the OS version or other settings being changed during the upgrade process.

When the upstream servers were Windows Server 2003, the first step of dig +trace (request . IN NS from the first entry in /etc/resolv.conf) would occasionally return 0 byte responses.
When the upstream servers were upgraded to Windows Server 2012, the zero byte response problem went away but was replaced with an issue where we would sporadically get the list of forwarders configured on the DNS server.

Example of the second problem:

$ dig +trace -x 1.2.3.4      

; <<>> DiG 9.8.2 <<>> +trace -x 1.2.3.4
;; global options: +cmd
.                       3600    IN      NS      dns2.ad.example.com.
.                       3600    IN      NS      dns1.ad.example.com.
;; Received 102 bytes from 192.0.2.11#53(192.0.2.11) in 22 ms

1.in-addr.arpa.         84981   IN      NS      ns1.apnic.net.
1.in-addr.arpa.         84981   IN      NS      tinnie.arin.net.
1.in-addr.arpa.         84981   IN      NS      sec1.authdns.ripe.net.
1.in-addr.arpa.         84981   IN      NS      ns2.lacnic.net.
1.in-addr.arpa.         84981   IN      NS      ns3.apnic.net.
1.in-addr.arpa.         84981   IN      NS      apnic1.dnsnode.net.
1.in-addr.arpa.         84981   IN      NS      ns4.apnic.net.
;; Received 507 bytes from 192.0.2.228#53(192.0.2.228) in 45 ms

1.in-addr.arpa.         172800  IN      SOA     ns1.apnic.net. read-txt-record-of-zone-first-dns-admin.apnic.net. 4827 7200 1800 604800 172800
;; Received 127 bytes from 202.12.28.131#53(202.12.28.131) in 167 ms

In most cases this isn't a problem, but it will cause dig +trace to follow the wrong path if we are tracing within a domain that AD has an internal view for.

Why is dig +trace losing its mind? And why do we seem to be the only ones complaining?

1 Answer

Andrew B · Answer 1 · 2014-06-14T16:44:30+08:00

You are being trolled by root hints. This one is tricky to troubleshoot, and it hinges on understanding that the . IN NS query sent at the start of a trace does not set the RD (recursion desired) flag on the packet.

When Microsoft's DNS server receives a non-recursive request for the root nameservers, it may return the configured root hints instead of the real roots. So long as you do not add the RD flag to the request, the server will happily continue to return that same response with a fixed TTL all day long.

$ dig @192.0.2.11 +norecurse . NS

; <<>> DiG 9.8.2 <<>> @192.0.2.11 +norecurse . NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12586
;; flags: qr ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2

;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       3600    IN      NS      dns2.ad.example.com.
.                       3600    IN      NS      dns1.ad.example.com.

;; ADDITIONAL SECTION:
dns2.ad.example.com.    3600    IN      A       192.0.2.228
dns1.ad.example.com.    3600    IN      A       192.0.2.229
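
One way to tell this kind of response apart from a healthy one is to check whether the names returned for . NS look like the public root servers. The following is a minimal sketch, assuming the public roots are named a.root-servers.net. through m.root-servers.net.; the function name looks_like_real_roots is made up for illustration, and 192.0.2.11 is the example resolver from the output above.

```shell
# Reads NS target names (one per line, as printed by `dig +short`) on stdin.
# Succeeds only if there is at least one name and every name is one of
# the public root servers, a.root-servers.net. through m.root-servers.net.
looks_like_real_roots() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        { total++ }
        tolower($1) ~ /^[a-m]\.root-servers\.net\.$/ { good++ }
        END { exit !(total > 0 && good == total) }
    '
}

# Example use against the resolver from the output above (requires network):
#   dig @192.0.2.11 +norecurse +short . NS | looks_like_real_roots \
#       && echo "real roots" || echo "root hints or forwarders"
```

The check is deliberately strict: a mixed answer is treated the same as one made up entirely of internal names, since either would send a trace down the wrong path.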

This is where most troubleshooting efforts will break down, because the easy assumption to leap to is that dig @whatever . NS will reproduce the problem. In fact it masks the problem completely, because dig sets the RD flag unless told otherwise. When the server gets a request for root nameservers with the RD flag set, it will reach out and grab a copy of the real root nameservers, and all subsequent requests for . NS without the RD flag will magically start working as expected. This makes dig +trace happy again, and everyone can go back to scratching their heads until the problem reappears.
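
When reproducing this, it helps to confirm which kind of query a given test actually sent. The server echoes the RD bit back in its reply, so rd appears on the ";; flags:" line of dig's output whenever the query asked for recursion (compare the "qr ra" line in the +norecurse output earlier). A small sketch that checks saved dig output for that flag; the function name query_was_recursive is hypothetical:

```shell
# Reads full dig output on stdin and succeeds if the response header
# lists the rd flag, meaning the query asked for recursion.
query_was_recursive() {
    awk '
        /^;; flags:/ {
            sub(/^;; flags:/, "")     # drop the label
            sub(/;.*/, "")            # drop the counters after the flag list
            n = split($0, flags, " ")
            for (i = 1; i <= n; i++)
                if (flags[i] == "rd") found = 1
        }
        END { exit !found }
    '
}

# Example use (requires network; 192.0.2.11 is the example resolver):
#   dig @192.0.2.11 . NS            | query_was_recursive   # default query
#   dig @192.0.2.11 +norecurse . NS | query_was_recursive   # RD cleared
```

Only a test whose output fails this check is exercising the same non-recursive path that the first step of dig +trace takes.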

Your options are to either negotiate a different configuration with your domain admins, or to work around the problem. So long as the poisoned root hints are good enough in most circumstances (and you're aware of the circumstances where they're not: conflicting views, etc.), this isn't a huge inconvenience.

Some workarounds without changing the root hints are:

Run your traces on a machine that has a less nutty set of default resolvers.
Start your trace from a nameserver that returns root name servers for the internet in response to . NS. You can also hardwire this nameserver into ${HOME}/.digrc, but this may confuse others on a shared account or be forgotten by you at some point.
dig @somethingelse +trace example.com
Seed the root hints yourself prior to running your trace.
dig . NS
dig +trace example.com
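
If you use the seeding workaround often, the two commands can be wrapped in a shell function so the seed is never forgotten. This is only a sketch: the name seeded_trace is hypothetical, and it assumes the recursive . NS query reaches the same server that the trace will start from (the first entry in /etc/resolv.conf).

```shell
# Hypothetical wrapper: ask for the roots with recursion first (dig sets
# RD by default), then run the trace with whatever arguments were given.
seeded_trace() {
    # Output is discarded; only the side effect on the server matters.
    dig . NS >/dev/null 2>&1
    dig +trace "$@"
}

# Example: seeded_trace -x 1.2.3.4
```

Keeping this as a function rather than a .digrc entry leaves plain dig untouched for anyone else sharing the account.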
