I have a KVM system on which I'm running a network bridge directly between all VMs and a bond0 (eth0, eth1) on the host OS. As such, all machines are presented on the same subnet and are reachable from outside the box. The bond is running mode 1 (active/passive) with an arp_ip_target set to the default gateway, which has caused some issues in itself, but I can't see the bond config mattering here.
Most times when I stop and start a guest on the platform, the host loses network connectivity (icmp, ssh) for about 30 seconds. The other, already running VMs don't lose connectivity: they can always ping the default GW, but the host can't. I say "about 30 seconds", but from some tests it usually seems to be 28 seconds (or at least, I lose 28 pings), and I'm wondering if this somehow relates to the bridge config.
I'm not running STP on the bridge at all. The forwarding delay is set to 1 second, the path cost on bond0 is lowered to 10, and the port priority of bond0 is lowered to 1. As such I don't think the bridge should ever conclude that bond0 is anything other than connected just fine (as the continued guest connectivity implies), yet the IP of the host, which is on the bridge device (could that matter?), becomes unreachable.
I'm fairly sure it's about the bridged networking, but at the same time as this happens when a VM is started there are clearly loads of other things also happening so maybe I'm way off the mark.
Lack of connectivity:
# ping 10.20.11.254
PING 10.20.11.254 (10.20.11.254) 56(84) bytes of data.
64 bytes from 10.20.11.254: icmp_seq=1 ttl=255 time=0.921 ms
64 bytes from 10.20.11.254: icmp_seq=2 ttl=255 time=0.541 ms
type=1700 audit(1293462808.589:325): dev=vnet6 prom=256 old_prom=0 auid=4294967295 ses=4294967295
type=1700 audit(1293462808.604:326): dev=vnet7 prom=256 old_prom=0 auid=4294967295 ses=4294967295
type=1700 audit(1293462808.618:327): dev=vnet8 prom=256 old_prom=0 auid=4294967295 ses=4294967295
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdd694a
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
64 bytes from 10.20.11.254: icmp_seq=30 ttl=255 time=0.514 ms
64 bytes from 10.20.11.254: icmp_seq=31 ttl=255 time=0.551 ms
64 bytes from 10.20.11.254: icmp_seq=32 ttl=255 time=0.437 ms
64 bytes from 10.20.11.254: icmp_seq=33 ttl=255 time=0.392 ms
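To put a number on the outage rather than eyeballing it, the gaps in icmp_seq can be counted from saved ping output. This is only a minimal sketch, assuming the iputils ping reply format shown above; the function name lost_pings and the file name ping.log are made up for illustration.

```shell
# Hypothetical helper: reads ping output on stdin and reports every gap
# in the icmp_seq numbers, i.e. echo requests that never got a reply.
# Lines without an icmp_seq= field (kernel messages etc.) are ignored.
lost_pings() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^icmp_seq=[0-9]+$/) {
                    seq = substr($i, 10) + 0      # text after "icmp_seq="
                    if (have_prev && seq > prev + 1)
                        printf "lost %d (seq %d-%d)\n", seq - prev - 1, prev + 1, seq - 1
                    prev = seq
                    have_prev = 1
                }
            }
        }
    '
}

# Example use (not run here):
#   ping 10.20.11.254 | tee ping.log      # in one terminal, while restarting a guest
#   lost_pings < ping.log                 # afterwards
```

For the capture above, where icmp_seq jumps from 2 to 30, this reports 27 missing replies (seq 3-29).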
brctl output of relevant bridge:
# brctl showstp brdev
brdev
bridge id 8000.b2e1378d1396
designated root 8000.b2e1378d1396
root port 0 path cost 0
max age 19.99 bridge max age 19.99
hello time 1.99 bridge hello time 1.99
forward delay 0.99 bridge forward delay 0.99
ageing time 299.95
hello timer 0.50 tcn timer 0.00
topology change timer 0.00 gc timer 0.04
flags
vnet5 (3)
port id 8003 state forwarding
designated root 8000.b2e1378d1396 path cost 100
designated bridge 8000.b2e1378d1396 message age timer 0.00
designated port 8003 forward delay timer 0.00
designated cost 0 hold timer 0.00
flags
vnet0 (2)
port id 8002 state forwarding
designated root 8000.b2e1378d1396 path cost 100
designated bridge 8000.b2e1378d1396 message age timer 0.00
designated port 8002 forward delay timer 0.00
designated cost 0 hold timer 0.00
flags
bond0 (1)
port id 0001 state forwarding
designated root 8000.b2e1378d1396 path cost 10
designated bridge 8000.b2e1378d1396 message age timer 0.00
designated port 0001 forward delay timer 0.00
designated cost 0 hold timer 0.00
flags
I do see the new port listed as learning, but in line with the forward delay, only for 1 or 2 seconds when polling the brctl output on a loop.
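The polling can be made easier to read by reducing each brctl showstp dump to one line per port. A rough sketch, assuming the showstp layout shown above (a "name (number)" line followed by a "port id ... state ..." line); the function name port_states is hypothetical.

```shell
# Hypothetical helper: reads `brctl showstp <bridge>` output on stdin and
# prints one "port state" line per bridge port.
port_states() {
    awk '
        # A port header looks like "vnet5 (3)".
        NF == 2 && $2 ~ /^\([0-9]+\)$/ { port = $1; next }
        # The state is the word following "state" on the "port id" line.
        $1 == "port" && $2 == "id" {
            for (i = 3; i < NF; i++)
                if ($i == "state") print port, $(i + 1)
        }
    '
}

# Example polling loop (not run here):
#   while sleep 1; do date +%T; brctl showstp brdev | port_states; done
```

A port that is still in the listening or learning state then stands out immediately next to the ones that are forwarding.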
ifconfig without sample VM:
bond0 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:21168629 errors:0 dropped:0 overruns:0 frame:0
TX packets:9280285 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8777768179 (8.1 GiB) TX bytes:2671736365 (2.4 GiB)
bradSP1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:36 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1656 (1.6 KiB) TX bytes:6592 (6.4 KiB)
brawSP1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:109 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4996 (4.8 KiB) TX bytes:6592 (6.4 KiB)
brdev Link encap:Ethernet HWaddr B2:E1:37:8D:13:96
inet addr:10.20.11.129 Bcast:10.20.11.255 Mask:255.255.255.0
inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16663718 errors:0 dropped:0 overruns:0 frame:0
TX packets:8800468 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3268513274 (3.0 GiB) TX bytes:2587834869 (2.4 GiB)
brmgtSP1 Link encap:Ethernet HWaddr 1A:CA:AE:08:1C:42
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:699322 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:928721301 (885.6 MiB) TX bytes:6706 (6.5 KiB)
eth0 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:20412120 errors:0 dropped:0 overruns:0 frame:0
TX packets:9280285 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8720799421 (8.1 GiB) TX bytes:2671736365 (2.4 GiB)
Interrupt:169 Memory:f4000000-f4012800
eth1 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:756509 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:56968758 (54.3 MiB) TX bytes:0 (0.0 b)
Interrupt:186 Memory:f2000000-f2012800
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:3937 errors:0 dropped:0 overruns:0 frame:0
TX packets:3937 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6641553 (6.3 MiB) TX bytes:6641553 (6.3 MiB)
vnet0 Link encap:Ethernet HWaddr B2:E1:37:8D:13:96
inet6 addr: fe80::b0e1:37ff:fe8d:1396/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:59861 errors:0 dropped:0 overruns:0 frame:0
TX packets:5924530 errors:0 dropped:0 overruns:2 carrier:0
collisions:0 txqueuelen:500
RX bytes:6405635 (6.1 MiB) TX bytes:1987480170 (1.8 GiB)
vnet1 Link encap:Ethernet HWaddr 1A:CA:AE:08:1C:42
inet6 addr: fe80::18ca:aeff:fe08:1c42/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:541798 errors:0 dropped:0 overruns:0 frame:0
TX packets:61998 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:802746110 (765.5 MiB) TX bytes:6498514 (6.1 MiB)
ifconfig with sample VM:
bond0 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:21285120 errors:0 dropped:0 overruns:0 frame:0
TX packets:9291457 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8948482155 (8.3 GiB) TX bytes:2673235824 (2.4 GiB)
bradSP1 Link encap:Ethernet HWaddr 2A:18:E1:2D:1A:EC
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:36 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1656 (1.6 KiB) TX bytes:6592 (6.4 KiB)
brawSP1 Link encap:Ethernet HWaddr 96:55:AA:14:67:07
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:109 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4996 (4.8 KiB) TX bytes:6592 (6.4 KiB)
brdev Link encap:Ethernet HWaddr 16:5C:BC:E5:90:11
inet addr:10.20.11.129 Bcast:10.20.11.255 Mask:255.255.255.0
inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16673094 errors:0 dropped:0 overruns:0 frame:0
TX packets:8801611 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3279365967 (3.0 GiB) TX bytes:2587927761 (2.4 GiB)
brmgtSP1 Link encap:Ethernet HWaddr 1A:CA:AE:08:1C:42
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:699342 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:928723605 (885.6 MiB) TX bytes:6706 (6.5 KiB)
eth0 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:20528382 errors:0 dropped:0 overruns:0 frame:0
TX packets:9291457 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8891497316 (8.2 GiB) TX bytes:2673235824 (2.4 GiB)
Interrupt:169 Memory:f4000000-f4012800
eth1 Link encap:Ethernet HWaddr D4:85:64:65:FA:4E
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:756738 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:56984839 (54.3 MiB) TX bytes:0 (0.0 b)
Interrupt:186 Memory:f2000000-f2012800
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:3937 errors:0 dropped:0 overruns:0 frame:0
TX packets:3937 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6641553 (6.3 MiB) TX bytes:6641553 (6.3 MiB)
vnet0 Link encap:Ethernet HWaddr B2:E1:37:8D:13:96
inet6 addr: fe80::b0e1:37ff:fe8d:1396/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:69818 errors:0 dropped:0 overruns:0 frame:0
TX packets:6034715 errors:0 dropped:0 overruns:2 carrier:0
collisions:0 txqueuelen:500
RX bytes:7763947 (7.4 MiB) TX bytes:2149238089 (2.0 GiB)
vnet1 Link encap:Ethernet HWaddr 1A:CA:AE:08:1C:42
inet6 addr: fe80::18ca:aeff:fe08:1c42/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:650557 errors:0 dropped:0 overruns:0 frame:0
TX packets:72519 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:964153780 (919.4 MiB) TX bytes:7896728 (7.5 MiB)
vnet2 Link encap:Ethernet HWaddr AA:4B:22:76:D2:EC
inet6 addr: fe80::a84b:22ff:fe76:d2ec/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:10521 errors:0 dropped:0 overruns:0 frame:0
TX packets:108765 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:1398214 (1.3 MiB) TX bytes:161408138 (153.9 MiB)
vnet3 Link encap:Ethernet HWaddr 96:55:AA:14:67:07
inet6 addr: fe80::9455:aaff:fe14:6707/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 b) TX bytes:468 (468.0 b)
vnet4 Link encap:Ethernet HWaddr 2A:18:E1:2D:1A:EC
inet6 addr: fe80::2818:e1ff:fe2d:1aec/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 b) TX bytes:468 (468.0 b)
vnet5 Link encap:Ethernet HWaddr 16:5C:BC:E5:90:11
inet6 addr: fe80::145c:bcff:fee5:9011/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:241 errors:0 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 b) TX bytes:47167 (46.0 KiB)
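Since the two listings are long, reducing each to "interface, hardware address" pairs makes the differences easier to spot. This is a sketch only, assuming the net-tools ifconfig format above (a header line containing "HWaddr"); the function names and the snapshot file names are hypothetical.

```shell
# Hypothetical helper: reads net-tools `ifconfig` output on stdin and
# prints "interface hwaddr" pairs, one per line, sorted.
hwaddrs() {
    awk '{ for (i = 2; i < NF; i++) if ($i == "HWaddr") print $1, $(i + 1) }' | sort
}

# Prints only the pairs that differ between two saved snapshots.
# Lines present only in the second snapshot are indented with a tab (comm -3).
hwaddr_changes() {
    hw_before=$(mktemp) && hw_after=$(mktemp) || return 1
    hwaddrs < "$1" > "$hw_before"
    hwaddrs < "$2" > "$hw_after"
    comm -3 "$hw_before" "$hw_after"
    rm -f "$hw_before" "$hw_after"
}

# Example use (not run here):
#   ifconfig > before.txt    # without the sample VM
#   ifconfig > after.txt     # with the sample VM running
#   hwaddr_changes before.txt after.txt
```

Applied to the two listings above, the bridges bradSP1, brawSP1 and brdev show a different HWaddr once the VM is running, and vnet2 to vnet5 appear as new interfaces. Whether that is related to the outage is not established here.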
All pointers, tips or stabs in the dark appreciated.
Can you please post the output of
ifconfig -a
from before and after you start the VM?
I'm running almost the same setup and having the same issues (also not using STP). When I'm running more VMs the issues are pretty much gone and the host stays available. Can you try pinging your gateway from the host in the background and then try to reproduce the issue by stopping/destroying a VM? This seems to work for me. I think that traffic sent by the host, or by a VM on the host, on the same VLAN or bridge triggers something that keeps the host itself available.