Question

wazoox

Asked: 2024-09-09 23:00:26 +0800 CST

Linux : trouble with bonding and MAC addresses

I had network problems lately (running Debian, but the issue is not specific to any distro; see below, where I manipulate /sys directly), and I discovered that the source of my woes was that two servers with bonded network interfaces had the same hardware address on their bonds. This MAC address is NOT one of the hardware interfaces' addresses, though it should be according to most documentation, such as this one:

The bonding interface has a hardware address of 00:00:00:00:00:00 until the first slave is added. If the VLAN interface is created prior to the first enslavement, it would pick up the all-zeroes hardware address. Once the first slave is attached to the bond, the bond device itself will pick up the slave's hardware address, which is then available for the VLAN device.

Furthermore, contrary to what this documentation states, an "empty" bond (without any slaves) doesn't have a 00:00:00:00:00:00 hw address:

# modprobe bonding
# echo +bond0 > /sys/class/net/bonding_masters 
# ip link show bond0
3: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d6:f5:8e:9f:c2:42 brd ff:ff:ff:ff:ff:ff
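
As a side check, the address shown above has the "locally administered" bit set (bit 0x02 of the first octet), which is what you would expect from a generated address rather than a burned-in one. A minimal sketch of that test, where `is_locally_administered` is a hypothetical helper name and not part of any existing tool:

```shell
# Hypothetical helper: succeeds (exit status 0) when the MAC address given as $1
# has the locally administered bit (0x02) set in its first octet.
is_locally_administered() {
    first_octet=${1%%:*}                  # text before the first colon, e.g. "d6"
    [ $(( 0x$first_octet & 0x02 )) -ne 0 ]
}

# Address of the empty bond0 shown above.
if is_locally_administered d6:f5:8e:9f:c2:42; then
    echo "d6:f5:8e:9f:c2:42 is locally administered"
fi
```

The bit only says the address is not a vendor-assigned one; virtual NICs can carry locally administered addresses too, so it does not prove by itself that the bonding driver generated it.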

However, that hardware address changes to the slave's hardware address if and only if the slave is added to the bond immediately and the bond's address isn't read first. I tested this with a very simple script (see below) that manipulates the bond.

First, it creates an empty bond without slaves and displays its hw address (which is apparently pseudo-random, but tends to stay the same; maybe udev plays a role here?).
Second, it creates the bond, sets its mode and adds a slave: the bond takes the slave's hw address, as expected.
Third, it creates the bond, reads its hw address, then sets the mode and adds a slave: the bond's hw address doesn't match its slave's hw address (but when I run the script again and again, from time to time it does! Go figure).
Fourth, it creates the bond, sets its mode, waits one second, then adds a slave. The behaviour is mostly the same as in step 3: the bond keeps the same pseudo-random hw address as in step 1, which always differs from the slave's hw address and is always the same.

I don't know exactly how the bond's pseudo-random hw address is chosen, but the same values recur across several hardware and software configurations (different hardware, different kernel versions, different Debian releases).

Sometimes, at step 3, the bond's MAC address changes to a different pseudo-random value; sometimes it matches the slave's address. Most of the time, however, it stays the same as in step 1. At step 4, the bond's MAC address is always the same as in step 1.

Apparently, setting the bond's MAC address is very timing-sensitive?

Here is the script:

echo "unload / load bonding"
rmmod bonding
sleep 1
modprobe bonding
sleep 1
echo "create bond0"
echo +bond0 > /sys/class/net/bonding_masters
echo "bond0 hw address, no slaves:"
cat /sys/class/net/bond0/address
sleep 3
echo "################"
echo "unload / load bonding"
rmmod bonding
sleep 1
modprobe bonding
sleep 1
echo "create bond0 and configure it without delay"
echo +bond0 > /sys/class/net/bonding_masters
# cat /sys/class/net/bond0/address
echo 6 > /sys/class/net/bond0/bonding/mode
echo +enp1s0  > /sys/class/net/bond0/bonding/slaves
echo "Bond0 hw address:"
cat /sys/class/net/bond0/address
echo "enp1s0 hw address:"
ethtool -P enp1s0

echo "################"
sleep 3
echo "unload / load bonding"
rmmod bonding
sleep 1
modprobe bonding
sleep 1
echo "create bond0 and configure it, read its hw address first"
echo +bond0 > /sys/class/net/bonding_masters
cat /sys/class/net/bond0/address
echo 6 > /sys/class/net/bond0/bonding/mode
echo +enp1s0  > /sys/class/net/bond0/bonding/slaves
echo "Bond0 hw address:"
cat /sys/class/net/bond0/address
echo "enp1s0 hw address:"
ethtool -P enp1s0

echo "################"
sleep 3
echo "unload / load bonding"
rmmod bonding
sleep 1
modprobe bonding
sleep 1
echo "create bond0 and configure it after 1 second delay"
echo +bond0 > /sys/class/net/bonding_masters
# cat /sys/class/net/bond0/address
echo 6 > /sys/class/net/bond0/bonding/mode
sleep 1
echo +enp1s0  > /sys/class/net/bond0/bonding/slaves
echo "Bond0 hw address:"
cat /sys/class/net/bond0/address
echo "enp1s0 hw address:"
ethtool -P enp1s0

And here's its output:

unload / load bonding
create bond0
bond0 hw address, no slaves:
ea:dc:34:e6:7c:8d
################
unload / load bonding
create bond0 and configure it without delay
Bond0 hw address:
52:54:00:c8:76:09
enp1s0 hw address:
Permanent address: 52:54:00:c8:76:09
################
unload / load bonding
create bond0 and configure it, read its hw address first
d6:f5:8e:9f:c2:42
Bond0 hw address:
d6:f5:8e:9f:c2:42
enp1s0 hw address:
Permanent address: 52:54:00:c8:76:09
################
unload / load bonding
create bond0 and configure it after 1 second delay
Bond0 hw address:
d6:f5:8e:9f:c2:42
enp1s0 hw address:
Permanent address: 52:54:00:c8:76:09

From time to time, the bond's hw address changes to some other value. However, most of the time it falls back to the same one (here 'd6:f5:8e:9f:c2:42') and seems to cycle through a limited number of MAC addresses across reboots.

However, the very serious problem is that different machines end up with the same pseudo-random hardware address; when they're connected to the same network switch, chaos ensues. In fact, checking several machines connected to different networks, I found at least 4 that share the same bond MAC address (though as long as they're not connected to the same switch, it's mostly harmless).
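
One way to spot such collisions before the machines meet on a switch is to collect each machine's bond address (for example from /sys/class/net/bond0/address) into a list and look for repeats. A rough sketch, assuming a two-column "host MAC" input; the function name `find_duplicate_macs` and the host names srv1 to srv3 are made up for illustration:

```shell
# Hypothetical helper: reads "host mac" pairs on stdin and prints every MAC
# that is reported by more than one host, followed by those hosts.
find_duplicate_macs() {
    awk '
        {
            mac = tolower($2)             # compare case-insensitively
            if (count[mac]++) { hosts[mac] = hosts[mac] " " $1 }
            else              { hosts[mac] = $1 }
        }
        END {
            for (mac in count)
                if (count[mac] > 1) print mac ": " hosts[mac]
        }
    '
}

# Made-up inventory; the MAC addresses are the ones from the script output above.
printf '%s\n' \
    'srv1 d6:f5:8e:9f:c2:42' \
    'srv2 52:54:00:c8:76:09' \
    'srv3 D6:F5:8E:9F:C2:42' | find_duplicate_macs
```

With this sample input, only the d6:f5:8e:9f:c2:42 line is reported, listing srv1 and srv3.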

Note that in this particular example I set up the bond in mode 6, but I had the problem on machines running in mode 4 (802.3ad) and other modes. This doesn't seem related to the bonding mode at all: changing the mode to 1, 2 or 4 doesn't change the MAC address.

Of course, I could force the bond's MAC address to some meaningful value using an if-up.d script or something similar, but I'd rather have something that works out of the box :)

1 Answer

M.D. Klapwijk · Answer 1 · 2025-01-24T20:53:46+08:00

There have been issues with the MAC addresses used in Linux bonding (mode 4) for quite some time. In the past, the MAC of the first interface in the bond was used, which was fine when a new active aggregate was determined only upon complete failure/disconnection of the active aggregate.

That all changed when the code started selecting a new active aggregate on every change on the bonded interfaces, which makes it possible to pick the aggregate with the most bandwidth, links, etc. each time.

And this is where the problems start; imagine the following:

An 8-interface bond runs 2 equal aggregates over 2 dumb switches: aggregate 1 (eth0 to eth3) is active, aggregate 2 (eth4 to eth7) is passive/unused, and the bond/aggregate carries the MAC of the first interface in the bond (eth0).
eth3 gets disconnected, and aggregate 2 becomes active, carrying the MAC of the bond (= eth0).

All was fine on active aggregate 1, but after the switch to aggregate 2, eth0 is still connected with its own MAC address, and aggregate 2 is now using that same MAC on the other switch! This happens because mode 4 (802.3ad/LACP) keeps them connected and monitored for state changes, resulting in around 20% or more packet loss...

I've seen this happen at many sites, even with appliances like NetApp filers. What I tend to end up doing is setting the bond's MAC to a private (locally administered) MAC address based on the MAC of eth0, by just replacing the first octet with 02:

root@kc-host-01:~# ethtool -P eth0
Permanent address: 34:1a:4c:03:a0:95

root@kc-host-01:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto eth1
iface eth1 inet manual

auto eth2
iface eth2 inet manual

auto eth3
iface eth3 inet manual

...

auto bond0
iface bond0 inet manual
        bond-slaves eth0 eth1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        bond-downdelay 200
        bond-updelay 200
        up ip link set dev bond0 address 02:1a:4c:03:a0:95

auto vmbr0
iface vmbr0 inet static
        address 192.168.100.253/24
        gateway 192.168.100.254
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
...

This way the mac on the bond is unique and traceable/linkable to the actual hardware used.
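
The octet replacement can be scripted instead of done by hand. A small sketch, where `private_mac_from` is a hypothetical helper name and the input is assumed to be a colon-separated MAC as printed by `ethtool -P`:

```shell
# Hypothetical helper: replace the first octet of the MAC given as $1 with 02
# and keep the remaining five octets unchanged.
private_mac_from() {
    printf '02:%s\n' "${1#*:}"           # "${1#*:}" drops everything up to the first colon
}

private_mac_from 34:1a:4c:03:a0:95       # prints 02:1a:4c:03:a0:95

# On a real host the input could come from the permanent address, e.g.:
#   private_mac_from "$(ethtool -P eth0 | awk '{ print $3 }')"
```

Since the original first octet is discarded, two NICs whose last five octets coincide would map to the same result; that is unlikely, but probably worth a check on a large fleet.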

I did notice that some distributions recently started setting private MACs for bonded interfaces automagically, but I like to keep it under my own control...