I have keepalived set up (floating VIP) in front of haproxy on each node of my three-node Galera cluster. When I restart keepalived on any given node, I sometimes end up with two nodes in the MASTER state (as evidenced by the /etc/keepalived/log_status.sh notify script):
# cat /etc/keepalived/log_status.sh
#!/bin/bash
# keepalived calls this with: $1 = GROUP or INSTANCE, $2 = name, $3 = new state
echo "$1 $2 is in $3 state" > "/var/run/keepalive.$1.$2.state"
From what I've read, 'multiple masters' is usually caused by multicast being filtered on the switch, but I can run tcpdump on any one of my Galera nodes and see the multicast traffic hitting the NIC (these are KVM virtual machines). I can try changing to unicast, but I would like to know whether this is due to a bug, a feature, or my config.
# cat /etc/keepalived/keepalived.conf
log "setting up keepalived"
global_defs {
    router_id host1 # short hostname of each KA node (10.20.18.201-203)
}
vrrp_script check_haproxy {
    script "pidof haproxy"
    interval 2
    weight 2
}
vrrp_instance 250 {
    virtual_router_id 250
    advert_int 1
    nopreempt
    priority 100
    state BACKUP
    interface eth0
    notify /etc/keepalived/log_status.sh
    virtual_ipaddress {
        10.20.18.250 dev eth0
    }
    track_script {
        check_haproxy
    }
}
tcpdump output:
09:44:00.934942 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:01.936054 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:02.937315 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:03.938444 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:04.942302 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:05.373224 IP 10.20.18.201 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:05.943936 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:06.029216 IP 10.20.18.201 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:06.385127 IP 10.20.18.201 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:06.945303 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:07.333210 IP 10.20.18.201 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:07.946098 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:08.947228 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:09.948507 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:10.548023 IP 10.20.18.202 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:10.663961 IP 10.20.18.202 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:10.949633 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:11.559970 IP 10.20.18.202 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:11.587980 IP 10.20.18.202 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:11.950795 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:12.952124 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:13.953075 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:14.953543 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:15.954703 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:15.987641 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 0, authtype none, intvl 1s, length 20
09:44:15.992698 IP 10.20.18.203 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:16.008817 IP 10.20.18.203 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:17.008829 IP 10.20.18.203 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:17.036879 IP 10.20.18.203 > 224.0.0.22: igmp v3 report, 1 group record(s)
09:44:20.613407 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:21.615616 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:22.616909 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:23.618155 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
09:44:24.619607 IP 10.20.18.203 > 224.0.0.18: VRRPv2, Advertisement, vrid 250, prio 102, authtype none, intvl 1s, length 20
Single word answer: iptables.
I was running two instances of keepalived - one to allow access from inside networks and the other to support external access.
I copied the internal configuration to create an external keepalived instance. While keepalived was working properly on the first interface (the internal one, eth0), my copied config was producing a VIP on both hosts.
My review of tcpdump showed that the multicast VRRP traffic was allowed on the network and visible to both keepalived instances. I checked the traffic on both the internal and external interfaces (eth0 internal, eth1 external).
VRRP traffic must be allowed through the host firewall. I found that I could sniff the traffic successfully and saw VRRP advertisements from both of my keepalived instances with the correct (and different) priorities. That proves less than it seems: tcpdump captures packets before iptables filters them, so seeing the advertisements on the interface does not mean keepalived receives them. My iptables configuration was only allowing VRRP traffic on eth0.
The relevant lines in /etc/sysconfig/iptables:
Before (problems with keepalived on eth1, but eth0 OK):
After (all good):
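As an illustrative sketch only, not the exact lines from that file, rules of the following shape (iptables-save format) would accept VRRP advertisements on both interfaces. It assumes the interface names eth0 and eth1 from the text, the INPUT chain, and that matching on IP protocol 112 (VRRP) sent to the VRRP multicast group 224.0.0.18 is enough for the setup; any existing DROP or REJECT rule has to come after these lines.

```
# Hypothetical example, not the original rules.
# Accept VRRP (IP protocol 112) advertisements addressed to the
# VRRP multicast group on the internal and the external interface.
-A INPUT -i eth0 -p 112 -d 224.0.0.18/32 -j ACCEPT
-A INPUT -i eth1 -p 112 -d 224.0.0.18/32 -j ACCEPT
```

With a rule for only one interface, the instance on the other interface never sees its peer's advertisements, so both sides take the VIP.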
I had a similar configuration working just fine in a local Vagrant environment, but when I configured it on a cloud provider's servers, the BACKUP always promoted itself and I ended up with two MASTERs at the same time.
I tried changing firewall rules, but what did it for me was entering the private network interface in the vrrp_instance interface field, along with adding unicast_src_ip and unicast_peer blocks: unicast_src_ip holds the server's own IP address, and unicast_peer lists the addresses of the remaining Keepalived nodes.
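A minimal sketch of what that could look like in keepalived.conf, reusing the addresses and instance name from the question purely for illustration. The interface name and all IP addresses here are assumptions; on each node, unicast_src_ip must be that node's own address and unicast_peer the other nodes.

```
vrrp_instance 250 {
    state BACKUP
    interface eth0                 # private network interface (assumed name)
    virtual_router_id 250
    priority 100
    advert_int 1
    unicast_src_ip 10.20.18.201    # this node's own address (example value)
    unicast_peer {
        10.20.18.202               # the remaining keepalived nodes (example values)
        10.20.18.203
    }
    virtual_ipaddress {
        10.20.18.250 dev eth0
    }
}
```

With unicast_peer set, advertisements are sent directly to the listed peers instead of to 224.0.0.18, which helps on networks that do not forward multicast, as is common with cloud providers.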