Host: 192.168.1.144/24
VMs(routed) network: 192.168.122.0/24
I have VMs connected to libvirt's routed network.
<network>
  <name>routed-122</name>
  <uuid>86ca64a6-7fea-4cf8-9625-fa45fe944c2c</uuid>
  <forward mode='route'/>
  <bridge name='virbr1' zone='public' stp='on' delay='0'/>
  <mac address='52:54:00:0e:84:40'/>
  <domain name='routed_nat'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.128' end='192.168.122.254'/>
      <host mac='52:54:00:32:a9:9d' name='kvm-srv01' ip='192.168.122.131'/>
      <host mac='52:54:00:82:b7:f7' name='kvm-srv02' ip='192.168.122.132'/>
      <host mac='52:54:00:ee:38:54' name='kvm-srv03' ip='192.168.122.133'/>
    </dhcp>
  </ip>
</network>
UDP works successfully between:
VMs <--> bridge virbr1(192.168.122.1)
VMs <--> host external network (192.168.1.0/24)
UDP does not work between:
VMs <--> Host 192.168.1.144/24
At the same time, TCP works without problems!
I check the operation of UDP in this way (via iperf3 -u, which means UDP mode):
- on HOST: iperf3 -s (server mode)
- on VM: iperf3 -u -c HOST-IP
Or vice versa - the result is the same, no communication via UDP.
The dump via Wireshark shows the following.
Tried connecting VMs via default NAT network (192.168.100.0/24) - same issue.
Ran into this while experimenting with docker swarm.
This is a source address selection issue on the multi-homed host, and it affects UDP but not TCP.
From the screenshot, VM's iperf3 client sends UDP packet:
and expects reply from host's iperf3 server like this:
but instead host replies with:
As the source doesn't match what is expected (the client UDP socket is connect(2)-ed to 192.168.1.144:5021) the VM's kernel sends an ICMP port unreachable back to the host's iperf3 server.
This is a known issue for UDP applications that use the BSD socket API and are not aware of multi-homing: as the UDP socket is usually bound to 0.0.0.0 (INADDR_ANY), it doesn't know on which of the host's addresses it received a UDP packet (this piece of information is not available by default). When replying with the socket's source address, 0.0.0.0 (INADDR_ANY), it relies on the routing stack to fill in the actual source address from the route. This specifically fails when the packet was received on another interface (eg: virbr1) than the one the destination IP address was added to (eg: eth0): the reply leaves through virbr1 and gets that interface's address (here 192.168.122.1) as its source instead of 192.168.1.144.
TCP is not affected because once the connection is accept(2)-ed, and a new socket is created in established state, this new socket is automatically bound to the correct local address: the destination of the initial request, even when the listening socket was bound to INADDR_ANY.
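The difference between the two socket types can be observed directly on the loopback interface. The following is only a minimal sketch, not taken from iperf3; the function name local_addresses is made up for illustration. It shows that an accepted TCP socket reports the concrete address the client connected to, while a UDP socket bound to INADDR_ANY still reports 0.0.0.0 after receiving a datagram.

```python
import socket


def local_addresses():
    """Return (local address of an accepted TCP socket,
    local address of a UDP socket), both bound to INADDR_ANY."""
    # TCP: the listening socket is bound to 0.0.0.0 ...
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", 0))
    listener.listen(1)
    tcp_port = listener.getsockname()[1]
    tcp_client = socket.create_connection(("127.0.0.1", tcp_port), timeout=2)
    conn, _peer = listener.accept()
    # ... but the accepted socket is bound to the address the client used.
    tcp_local = conn.getsockname()[0]

    # UDP: the same socket serves every peer and stays on 0.0.0.0.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind(("0.0.0.0", 0))
    udp.settimeout(2)
    udp_client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_client.sendto(b"x", ("127.0.0.1", udp.getsockname()[1]))
    udp.recvfrom(16)
    udp_local = udp.getsockname()[0]

    for s in (conn, tcp_client, listener, udp, udp_client):
        s.close()
    return tcp_local, udp_local


if __name__ == "__main__":
    print(local_addresses())
```

Because the UDP socket never learns the destination address of the datagram, the kernel has to pick the reply's source address from the route alone.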
There are two methods to overcome this with UDP:
1. Never bind to INADDR_ANY but only to specific addresses, possibly multiple times (one socket per address). The reply will always come from the bound address and will thus always have a correct source IP address.
2. Or use the advanced feature provided by the socket option IP_PKTINFO (*BSD provides an equivalent feature with IP_RECVDSTADDR), which allows the application to know, for each received UDP packet, on what local address (as well as on what interface) it was received, and to use this to send back the correct reply while using a single UDP socket (usually) bound to INADDR_ANY. This requires the application to have additional specific code for proper handling.
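The second method can be sketched as follows. This is an illustrative, Linux-only example and not how any particular application implements it; the helper names recv_with_dst, send_from and demo are hypothetical. It assumes the Linux value 8 for IP_PKTINFO when the running Python version does not export the constant, and it relies on Linux treating all of 127.0.0.0/8 as local so that 127.0.0.2 can stand in for a "second" host address.

```python
import socket
import struct

# Assumption: fall back to the Linux value (8) if Python lacks the constant.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)
# struct in_pktinfo { int ipi_ifindex; in_addr ipi_spec_dst; in_addr ipi_addr; }
PKTINFO_FMT = "@i4s4s"
PKTINFO_LEN = struct.calcsize(PKTINFO_FMT)


def recv_with_dst(sock, bufsize=2048):
    """Receive one datagram and the local address it was sent to."""
    data, ancdata, _flags, peer = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(PKTINFO_LEN))
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            _ifindex, _spec_dst, addr = struct.unpack(
                PKTINFO_FMT, cdata[:PKTINFO_LEN])
            return data, peer, socket.inet_ntoa(addr)
    return data, peer, None


def send_from(sock, data, peer, src_ip):
    """Send a datagram to peer, forcing src_ip as the source address."""
    pktinfo = struct.pack(PKTINFO_FMT, 0,
                          socket.inet_aton(src_ip),
                          socket.inet_aton("0.0.0.0"))
    sock.sendmsg([data], [(socket.IPPROTO_IP, IP_PKTINFO, pktinfo)], 0, peer)


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    server.bind(("0.0.0.0", 0))  # a single socket on INADDR_ANY
    server.settimeout(2)
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    client.connect(("127.0.0.2", port))  # only accepts replies from 127.0.0.2
    client.send(b"ping")

    data, peer, dst_ip = recv_with_dst(server)
    send_from(server, b"pong", peer, dst_ip)  # reply from the address used
    reply = client.recv(2048)
    server.close()
    client.close()
    return dst_ip, reply


if __name__ == "__main__":
    print(demo())
```

The client is connect(2)-ed, like iperf3's UDP client, so it only accepts the reply because the server copies the received destination address into the reply's source address.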
Here, solution 1 is good enough with a recent enough iperf3: use the option --bind to specify the address to bind the TCP socket to (and, when a client requests UDP mode, the UDP socket as well). All replies will then come from the correct source address. If the VM is also multi-homed, the same kind of option can be used with the client.
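What binding to a specific address achieves can be sketched with plain sockets. This is only an illustration of the first method, not iperf3's code; the helper names open_bound_sockets and echo_once are hypothetical, and the loopback addresses 127.0.0.1 and 127.0.0.2 (both local on Linux) stand in for the addresses of a multi-homed host.

```python
import selectors
import socket


def open_bound_sockets(addresses, port=0):
    """Open one UDP socket per local address instead of one on INADDR_ANY.
    port=0 lets the kernel pick a port per socket; a real server would
    pass the same fixed port for every address."""
    socks = []
    for addr in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((addr, port))
        socks.append(s)
    return socks


def echo_once(socks, timeout=2.0):
    """Echo every pending datagram from the socket that received it.
    Returns the local addresses the replies were sent from."""
    sel = selectors.DefaultSelector()
    for s in socks:
        sel.register(s, selectors.EVENT_READ)
    replied_from = []
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        data, peer = sock.recvfrom(2048)
        # The source address of the reply is the address the socket is bound to.
        sock.sendto(data, peer)
        replied_from.append(sock.getsockname()[0])
    sel.close()
    return replied_from


if __name__ == "__main__":
    servers = open_bound_sockets(["127.0.0.1", "127.0.0.2"])
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    client.connect(servers[1].getsockname())
    client.send(b"ping")
    print(echo_once(servers), client.recv(2048))
```

The cost of this approach is one socket per address, and the set of addresses has to be known (or tracked) by the application.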
Note: my version of iperf3 actually binds by default to :: (in IPv6 dual-stack mode) rather than to the IPv4 address 0.0.0.0, but the behavior is the same.