We recently saw an issue after a failover of our router: our Windows 2008 boxes didn't start talking to the primary router again after failback.
When we did some digging, we found they still had the ARP entry for the secondary router. According to the TechNet Blog this is by design:
First, a Windows Vista or Windows Server 2008 will not update the Neighbor cache if an ARP broadcast is received unless it is part of a broadcast ARP request for the receiver. What this means is that when a gratuitous ARP is sent on a network with Windows Vista and Windows Server 2008, these systems will not update their cache with incorrect information if there is an IP address conflict.
Secondly, it appears that the Windows neighbor cache (ARP cache) is only updated if the machine can no longer talk to the machine that is currently in its cache. It does not send out occasional ARP requests to make sure the cache is not stale. While this isn't an issue during the initial failover, during failback, when both boxes are alive, this causes Windows to keep talking to the secondary box.
Is there any way to force Windows 2008 to accept Gratuitous ARP requests?
After testing it does seem that Hotfix 2582281 fixes the issue. You can get the hotfix without having to pay support by using their hotfix request page.
I ran a test of this using arping and an unpatched Windows 2008 R2 machine. I added a secondary IP, 64.34.119.80, to a machine within the same L2 network segment. I then issued the following command from a different machine on the network:
sudo arping -U 64.34.119.80 -I bond0 -c1
Right after that, I pinged 64.34.119.80 from the Windows box after seeing it receive the ARP in Wireshark. I then applied the hotfix and repeated the test.
Also, the arping command needs to use the broadcast destination MAC rather than a unicast MAC address, because in my tests broadcast GARPs were the only type that was ignored.
Before the patch:
In this Wireshark capture, the ping after the GARP request is not sent to the MAC address that the GARP came from, so you can see that the GARP is being ignored.
After the patch:
In this test, after the patch, the GARP request seems to be honored, as the ping is sent to the MAC address that the GARP came from.
So from these tests it seems hotfix 2582281 fixes the issue of GARP broadcasts being ignored.
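For anyone who wants to see what such a broadcast GARP looks like on the wire, here is a minimal Python sketch that only builds the raw frame bytes (it does not send anything). The sender MAC 00:11:22:33:44:55 and the function name are hypothetical, and zeroing the target hardware address is an assumption on my part; arping may fill that field differently.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6   # destination used by a broadcast GARP
ETHERTYPE_ARP = 0x0806
ARP_OP_REQUEST = 1


def build_garp_request(sender_mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    "Gratuitous" here means the sender and target protocol addresses
    are the same IP, i.e. the sender is announcing its own address.
    """
    smac = bytes.fromhex(sender_mac.replace(":", "").replace("-", ""))
    if len(smac) != 6:
        raise ValueError("sender_mac must be a 6-byte MAC address")
    addr = socket.inet_aton(ip)

    # Ethernet header: destination, source, EtherType
    eth = BROADCAST_MAC + smac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: htype=Ethernet(1), ptype=IPv4, hlen=6, plen=4, opcode
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_OP_REQUEST)
    # sender MAC, sender IP, target MAC (zeroed: assumption), target IP
    arp += smac + addr + b"\x00" * 6 + addr
    return eth + arp


frame = build_garp_request("00:11:22:33:44:55", "64.34.119.80")
```

The point to notice is the first six bytes: with ff:ff:ff:ff:ff:ff as the destination, this is the kind of GARP the unpatched box ignored in the tests above.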
When researching my own TCP/IP problem just now, I stumbled across this very interesting hotfix:
http://support.microsoft.com/kb/2582281
This sounds an awful lot like what you're running into. It's also a brand new hotfix, released 7/22/2011, so it wasn't around when you first ran into this.
Try
netsh interface ipv4 set interface x basereachable=y
where x is the interface index and y is the ARP timeout in milliseconds that you want. Remember to do it from a command prompt with admin privileges!
What first hop redundancy protocol are you using?
I'm aware this doesn't answer your question directly. However, VRRP (and its proprietary forebear, HSRP) uses a shared MAC address which is flipped to a new switch port when the master router changes. This gets around the need for a gratuitous ARP entirely.
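To make the shared MAC idea concrete, here is a small Python sketch that derives the well-known default virtual MAC from the group number. The function names are my own, and it assumes VRRP for IPv4 and HSRP version 1 defaults; a given router may be configured to use something else.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Default VRRP (IPv4) virtual MAC: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    return "00-00-5E-00-01-%02X" % vrid


def hsrp_v1_virtual_mac(group: int) -> str:
    """Default HSRP version 1 virtual MAC: 00-00-0C-07-AC-{group}."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP v1 group must be in the range 0..255")
    return "00-00-0C-07-AC-%02X" % group


print(vrrp_virtual_mac(10))     # 00-00-5E-00-01-0A
print(hsrp_v1_virtual_mac(1))   # 00-00-0C-07-AC-01
```

Because hosts only ever cache this virtual MAC for the gateway IP, their ARP entry stays valid across failover and failback; only the switch has to relearn which port the MAC lives on.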
Prereqs
1. WinPCAP 4.0.1 (4.1.2 version does not work)
- http://www.winpcap.org/archive/4.0.1-WinPcap.exe (Windows Version)
2. Wireshark 1.6.7
3. IPv6 disabled on network interface, due to arping restrictions
4. arping
- http://mathieu.carbou.free.fr/pub/arping/2.06/arping.zip (Windows Binary)
Execution
1. Get Interface Name
- "E:\Program Files\Wireshark\tshark.exe" -D
- From Wireshark interface details
2. Execute arping to send a gratuitous ARP request
- arping.exe -A -i \Device\NPF_{4399F778-AF25-4B6D-AFFB-A1F2C7DFA667} 10.20.30.50 -c 3 -S 10.20.30.50
Where 10.20.30.50 is the IP address you want to announce to the network (router)
I came across this via a link from http://blog.serverfault.com/post/windows-2008-and-broken-arp/.
Had you asked on Stack Overflow you might have had a fix much faster.
Sniff the GARP packets and run arp -s inet_addr eth_addr.
Don't do this if there's the remotest chance of getting a hostile machine on your LAN.
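As a rough illustration of that idea, the Python sketch below takes one already captured Ethernet frame (raw bytes) and, if it is a gratuitous ARP, returns the matching arp -s command line. The function name is hypothetical, capturing the frame is left out, VLAN-tagged frames are not handled, and nothing here validates that the sender is trustworthy, so the warning above applies in full.

```python
import socket
import struct
from typing import Optional


def garp_to_arp_s(frame: bytes) -> Optional[str]:
    """Return an 'arp -s' command for a gratuitous ARP frame, else None."""
    if len(frame) < 42:                      # 14-byte Ethernet + 28-byte ARP
        return None
    if frame[12:14] != b"\x08\x06":          # EtherType must be ARP
        return None

    htype, ptype, hlen, plen, _op = struct.unpack("!HHBBH", frame[14:22])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None                          # not Ethernet/IPv4 ARP

    sender_mac = frame[22:28]
    sender_ip = frame[28:32]
    target_ip = frame[38:42]
    if sender_ip != target_ip:               # gratuitous: sender announces itself
        return None

    # Windows' arp -s expects the MAC as hyphen-separated hex bytes
    mac_text = "-".join("%02x" % b for b in sender_mac)
    return "arp -s %s %s" % (socket.inet_ntoa(sender_ip), mac_text)
```

Returning the command as a string rather than running it is deliberate: it leaves a human (or a script with its own allow-list of MAC addresses) to decide whether the entry should really be pinned.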