This question was originally posted on the Network Engineering SE site, but closed for being off-topic.
I've stumbled upon a peculiar behavior of Windows hosts in our network.
Our clients are mostly in the wireless segment of our network. The Ubiquiti access points we use block multicast and broadcast traffic from the wired segment to the wireless segment for performance reasons.
When a host (client) on the WiFi side contacts a host (server) on the wired network that is running Linux, everything works fine. The client sends out an ARP request that is broadcast over to the wired segment, the server responds using a unicast ARP reply and learns the client's MAC address in the process. Then IP packets can be exchanged normally.
With a Windows server, however, it looks different. If I ping the server at 10.10.10.4 from the client at 10.10.244.14, I see the following packets on the server's interface:
15:11:50.892975 4c:32:75:95:b9:7b > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.4 tell 10.10.244.14, length 46
15:11:50.893725 52:75:9a:65:57:e4 > 4c:32:75:95:b9:7b, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.10.10.4 is-at 52:75:9a:65:57:e4, length 46
15:11:50.898967 4c:32:75:95:b9:7b > 52:75:9a:65:57:e4, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 39008, offset 0, flags [none], proto ICMP (1), length 84)
10.10.244.14 > 10.10.10.4: ICMP echo request, id 34083, seq 0, length 64
15:11:50.899285 52:75:9a:65:57:e4 > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.244.14 tell 10.10.10.4, length 46
15:11:51.711006 52:75:9a:65:57:e4 > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.244.14 tell 10.10.10.4, length 46
15:11:51.895016 4c:32:75:95:b9:7b > 52:75:9a:65:57:e4, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 9988, offset 0, flags [none], proto ICMP (1), length 84)
10.10.244.14 > 10.10.10.4: ICMP echo request, id 34083, seq 1, length 64
[...]
It seems that the Windows server does not update its ARP table from the received ARP request. Instead, it sends out its own request for the client's MAC address (the fourth packet above). These requests are broadcasts, so the access point blocks them and they never reach the client. As a consequence, the pings cannot be answered, because the server does not know which MAC address to send the replies to.
Here's the interesting part. If I keep pinging the server, after around two minutes the client's ARP entry for the server seems to go stale. The client then sends a unicast ARP request to the server to verify the cached MAC address, and from that moment on the pings are answered:
15:13:57.289539 4c:32:75:95:b9:7b > 52:75:9a:65:57:e4, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.4 (52:75:9a:65:57:e4) tell 10.10.244.14, length 46
15:13:57.289945 52:75:9a:65:57:e4 > 4c:32:75:95:b9:7b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 128, id 31608, offset 0, flags [none], proto ICMP (1), length 84)
10.10.10.4 > 10.10.244.14: ICMP echo reply, id 34083, seq 126, length 64
15:13:57.290001 52:75:9a:65:57:e4 > 4c:32:75:95:b9:7b, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.10.10.4 is-at 52:75:9a:65:57:e4, length 46
15:13:58.292013 4c:32:75:95:b9:7b > 52:75:9a:65:57:e4, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 60751, offset 0, flags [none], proto ICMP (1), length 84)
10.10.244.14 > 10.10.10.4: ICMP echo request, id 34083, seq 127, length 64
15:13:58.292336 52:75:9a:65:57:e4 > 4c:32:75:95:b9:7b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 128, id 31609, offset 0, flags [none], proto ICMP (1), length 84)
10.10.10.4 > 10.10.244.14: ICMP echo reply, id 34083, seq 127, length 64
To me, this looks as if the Windows server updates its ARP cache from an incoming ARP request for its own address only when that request is unicast.
If I understand the RFC (RFC 826) correctly, this doesn't seem to be the correct behavior. As I read it, the recipient of an ARP request should always update its ARP table if it already has an entry for the sending host, and should insert a new entry for the sending host if the request is for its own address.
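The reading of the RFC described above can be sketched in a few lines of Python. This is only an illustration of that interpretation, not a description of what any real network stack does; the `ArpPacket` and `Host` classes and their field names are hypothetical, and the addresses are taken from the capture. Note that the sketch never checks whether the frame arrived as a broadcast or a unicast.

```python
from dataclasses import dataclass, field


@dataclass
class ArpPacket:
    op: str          # "request" or "reply"
    sender_mac: str
    sender_ip: str
    target_ip: str


@dataclass
class Host:
    ip: str
    mac: str
    arp_table: dict = field(default_factory=dict)  # maps IP -> MAC

    def receive(self, pkt):
        """Process an incoming ARP packet; return a reply packet or None."""
        merge_flag = False
        # Refresh an existing entry for the sender, whoever the target is.
        if pkt.sender_ip in self.arp_table:
            self.arp_table[pkt.sender_ip] = pkt.sender_mac
            merge_flag = True
        # Packets addressed to another host need no further handling.
        if pkt.target_ip != self.ip:
            return None
        # We are the target: learn the sender if it was not already known.
        if not merge_flag:
            self.arp_table[pkt.sender_ip] = pkt.sender_mac
        # Answer requests with a reply addressed to the sender.
        if pkt.op == "request":
            return ArpPacket("reply", self.mac, self.ip, pkt.sender_ip)
        return None


# The first packet of the capture: the client asks for the server's MAC.
server = Host(ip="10.10.10.4", mac="52:75:9a:65:57:e4")
request = ArpPacket("request", "4c:32:75:95:b9:7b", "10.10.244.14", "10.10.10.4")
reply = server.receive(request)
print(server.arp_table)
```

Under this reading, the server would already know the client's MAC address after the very first request and would have no reason to send the broadcast request seen in the fourth packet.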
Am I misreading the spec, or does Windows do something different? If the latter, is this behavior configurable?
Thanks for any pointers.
Blocking broadcasts from the wired to the wireless segment breaks any and all communication between the two segments. Broadcasts are required for ARP to work.
You can filter or block specific types of undesired broadcasts, but not all of them. For a clean approach, you should use a dedicated wireless subnet (or several) and route between it and the rest of the network.
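To illustrate what a dedicated wireless subnet changes, here is a minimal sketch using Python's standard `ipaddress` module. The prefix lengths are assumptions, since the question does not state the netmask: /16 stands in for the current flat layout and /24 for a hypothetical dedicated wireless subnet. The helper `on_link` is made up for this example. It checks whether the client would treat the server as directly reachable, in which case it resolves the server's MAC address itself via ARP, or as reachable only through a router.

```python
import ipaddress


def on_link(src_ip, dst_ip, prefix_len):
    """Return True if dst_ip lies in src_ip's subnet for the given prefix length."""
    network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    return ipaddress.ip_address(dst_ip) in network


client, server = "10.10.244.14", "10.10.10.4"

# Assumed flat layout: both hosts share one subnet and ARP for each other directly.
same_subnet_flat = on_link(client, server, 16)

# Assumed dedicated wireless subnet: the server is off-link and traffic is routed.
same_subnet_split = on_link(client, server, 24)

print(same_subnet_flat, same_subnet_split)
```

With a split like this, the client and the server each resolve only their gateway's MAC address rather than each other's.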