Our Windows servers are registering IPv6 AAAA records with our Windows DNS servers. However, we don't have IPv6 routing enabled on our network, so this frequently causes stalls.
Microsoft RDP is the worst offender. When connecting to a server that has a AAAA record in DNS, the remote desktop client will try IPv6 first, and won't fall back to IPv4 until the connection times out. Power users can work around this by connecting to the IP address directly. Resolving the IPv4 address with ping -4 hostname.foo always works instantly.
What can I do to avoid this delay?
- Disable IPv6 on client?
- Nope, Microsoft says IPv6 is a mandatory part of the Windows operating system.
- Too many clients to ensure this is set everywhere consistently.
- Will cause more problems later when we finally implement IPv6.
- Disable IPv6 on the server?
- Nope, Microsoft says IPv6 is a mandatory part of the Windows operating system.
- Requires an inconvenient registry hack to disable the entire IPv6 stack.
- Ensuring this is correctly set on all servers is inconvenient.
- Will cause more problems later when we finally implement IPv6.
- Mask IPv6 records on the user-facing DNS recursor?
- Nope, we're using NLNet Unbound and it doesn't support that.
- Prevent registration of IPv6 AAAA records on the Microsoft DNS server?
- I don't think that's even possible.
At this point, I'm considering writing a script that purges all AAAA records from our DNS zones. Please, help me find a better way.
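For reference, a rough sketch of what that purge might look like using dnscmd (the server, zone, and host names here are purely illustrative):

```shell
rem Enumerate the AAAA records in the zone (server "dns1" and zone "mydomain.local" are examples)
dnscmd dns1 /EnumRecords mydomain.local @ /Type AAAA /Continue

rem For each node returned, delete its AAAA record(s)
dnscmd dns1 /RecordDelete mydomain.local myhost AAAA /f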
UPDATE: DNS resolution is not the problem. As @joeqwerty points out in his answer, the DNS records are returned instantly. Both A and AAAA records are immediately available. The problem is that some clients (mstsc.exe) will preferentially attempt a connection over IPv6, and take a while to fall back to IPv4.
This seems like a routing problem. The ping command produces a "General failure" error message because the destination address is unroutable.
C:\Windows\system32>ping myhost.mydomain
Pinging myhost.mydomain [2002:1234:1234::1234:1234] with 32 bytes of data:
General failure.
General failure.
General failure.
General failure.
Ping statistics for 2002:1234:1234::1234:1234:
Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
I can't get a packet capture of this behaviour. Running this (failing) ping command does not produce any packets in Microsoft Network Monitor. Similarly, attempting a connection with mstsc.exe to a host with an AAAA record produces no traffic until it falls back to IPv4.
UPDATE: Our hosts are all using publicly-routable IPv4 addresses. I think this problem might come down to a broken 6to4 configuration. 6to4 behaves differently on hosts with public IP addresses vs RFC1918 addresses.
UPDATE: There is definitely something fishy with 6to4 on my network. When I disable 6to4 on the Windows client, connections resolve instantly.
netsh int ipv6 6to4 set state disabled
But as @joeqwerty says, this only masks the problem. I'm still trying to find out why IPv6 communication on our network is completely non-working.
This question is pretty interesting and I must admit that I've never seen this behavior. In doing some fiddling around to try to understand it better, I took a snippet of nslookup querying for one of my W2K8R2 RDS servers from another W2K8R2 server, and I also captured a snippet of an RDP session to the same RDS server from the same test server. Nslookup showed no delay in returning the IPv6 record, and the capture showed my test server querying for the IPv4 record before querying for the IPv6 record. The time delta in the capture shows no appreciable delay (that I can ascertain) in either query.
EDIT
Now you're on to something.
Make sure you're capturing traffic for the Microsoft 6To4 adapter, otherwise you won't see IPv6:
Here's the nslookup result for my RDS server. Make note of the IPv6 addresses:
Now here's a snippet of my capture:
And finally, here's a snippet from netstat showing the connection:
So clearly, as you've confirmed, DNS resolution isn't the problem. The problem is that the RDP connection prefers IPv6 over IPv4 (the Windows default), and because IPv6 isn't functioning properly, the fallback from IPv6 to IPv4 causes the delay, as you've stated. You could fix this by configuring the clients to prefer IPv4 over IPv6, but I think that would merely be masking the problem. The better solution would be to figure out why IPv6 isn't working and fix that. I don't know enough about IPv6 to help, but my guess is that the IPv6 addresses being returned by DNS are "local" addresses valid only on the subnet where the RDS hosts exist, and since the clients are in a different subnet, they can't reach those IPv6 addresses.
The IPv6 transition technology called 6to4 is infamous for causing problems like this one. There are several factors at work. Individually they are harmless, but the combined effect is that end users can experience connection delays.
A list of enabling factors and thoughts on their mitigation is presented below.
Windows enables 6to4 by default
If your hosts are running a recent version of Windows (Vista or later), Windows will opportunistically enable 6to4 tunnelling when a publicly routable IPv4 address is available. Critically, this applies to both servers and clients.
To find out whether a system is using 6to4, run ipconfig and look for an IPv6 address that starts with the 6to4 prefix 2002:. To disable 6to4 on a host, run:
netsh int ipv6 6to4 set state disabled
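In the ipconfig output, a 6to4 address would look something like this (the addresses below are illustrative; the 2002: prefix embeds the host's public IPv4 address, here 198.51.100.1):

```
Tunnel adapter 6TO4 Adapter:

   Connection-specific DNS Suffix  . : mydomain.local
   IPv6 Address. . . . . . . . . . . : 2002:c633:6401::c633:6401
   Default Gateway . . . . . . . . . : 2002:c058:6301::c058:6301
```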
Publicly routable IPv4 addresses are being used
6to4 only works on hosts that have publicly routable IPv4 addresses, so this problem never affects hosts behind a NAT firewall.
6to4 is not functioning correctly on the network
It is infamously difficult to troubleshoot 6to4 in anycast mode. It is so troublesome that there was a formal request to the IETF that 6to4 be reclassified as historic. In the opinion of this author, 6to4 should be considered deprecated.
In brief, 6to4 works by encapsulating IPv6 packets inside IPv4 packets using a protocol called 6in4 (IP protocol 41). The IPv4 packets are addressed to the anycast address 192.88.99.1 in the hope that they will arrive at a working 6to4 relay somewhere on the internet. It might even be geographically nearby, if you're lucky.
In practice, some 6to4 relays are set up incorrectly, and many networks don't allow 6in4 traffic to cross the firewall at all. Typically this happens when a firewall allows all outbound traffic, but doesn't explicitly allow IP protocol 41 packets to return through the firewall. This failure ("inbound black hole") and many others are described in RFC 6343.
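As an example, on a Windows host running the built-in firewall, a rule permitting the returning protocol-41 traffic might look like the following (the rule name is illustrative; whether this is the right place to allow it depends on where the tunnel terminates on your network):

```shell
netsh advfirewall firewall add rule name="Inbound 6in4" dir=in action=allow protocol=41
```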
Dynamic DNS registration
In a typical Active Directory environment, every computer is permitted to register its own addresses with the DNS server. When a host is multihomed, it will register all of its addresses, even from a 6to4 tunnel.
Most internet services don't use dynamic DNS, so this problem is typically restricted to enterprise sites where the clients and servers are all "internal" to the same network.
Client application does not fail gracefully
Microsoft's RDP client is one example of a client application that does not gracefully deal with IPv6 routing problems. Most web browsers are better at dealing with IPv6 edge cases like this one, so they don’t tend to show this behaviour.
I realize it's not very helpful for this situation, but for implementors facing a similar dilemma, there is an implementation technique known as "Happy Eyeballs" (RFC 6555): the client attempts IPv4 and IPv6 connections simultaneously and uses whichever connects first.
Here was my solution. By default Windows gives IPv6 routes a higher priority than IPv4 routes. If you edit the IPv6 prefix policy, you can change this behavior to make it use IPv4 in preference to IPv6.
To make sure all the systems in my network are set up the same way, I put the following commands into a .bat script run during software installation after building or refurbishing a machine.
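A sketch of such a script, assuming the standard set of Windows prefix policies as the starting point (the exact precedence and label values below are illustrative, not authoritative):

```shell
rem Block 1: disable the built-in tunnelling interfaces
netsh int ipv6 isatap set state disabled
netsh int ipv6 6to4 set state disabled
netsh int teredo set state disabled

rem Block 2: delete the existing prefix policies
netsh int ipv6 delete prefixpolicy ::1/128
netsh int ipv6 delete prefixpolicy ::/0
netsh int ipv6 delete prefixpolicy ::ffff:0:0/96
netsh int ipv6 delete prefixpolicy 2002::/16
netsh int ipv6 delete prefixpolicy 2001::/32

rem Block 3: recreate them with the IPv4-mapped prefix (::ffff:0:0/96)
rem ranked above native IPv6 (::/0)
netsh int ipv6 add prefixpolicy ::1/128 50 0
netsh int ipv6 add prefixpolicy ::ffff:0:0/96 45 1
netsh int ipv6 add prefixpolicy ::/0 40 2
netsh int ipv6 add prefixpolicy 2002::/16 30 3
netsh int ipv6 add prefixpolicy 2001::/32 5 4
```

The resulting policy table can be inspected afterwards with netsh int ipv6 show prefixpolicies.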
To explain what this does:
The first three lines disable the built-in tunnelling interfaces, as they are redundant for most networks. If you are not giving your machines IPv6 addresses of their own, you might want to leave those three lines out, since the tunnels may be the machines' only path to IPv6. In my case, I have a DHCPv6 server and associated infrastructure assigning native IPv6 addresses, so the tunnelled connectivity is redundant.
The second block of commands deletes all existing IPv6 routing prefix policies.
The third block then recreates the IPv6 prefix policies, but with a different set of priorities. This way the prefix corresponding to IPv4-mapped addresses (::ffff:0:0/96) is given preference over native IPv6, and the machine will use IPv4 unless an application specifically requests IPv6.
This solution retains a functional dual stack, but the preference for IPv4 means that sites with incomplete, unreliable, or poorly performing IPv6 will avoid using it unless a program on the system asks for it.
It is my opinion that making operating systems prefer IPv6 over IPv4 actually hinders adoption. During the transition period, there will be times when a host thinks it has IPv6 connectivity but does not actually have a fully functional connection, leading to software malfunctions and long delays. Many people I know have disabled IPv6 entirely at their router as a workaround for ISPs that initially deployed IPv6 in a broken fashion; these people will simply forget to enable it again, leaving them without IPv6 until they reconfigure their router.