I installed keepalived on two firewalls to provide failover. I'm not sure whether the following configurations are correct (see configurations below).
Sometimes I have problems reaching the websites behind the firewalls. I suspect that when keepalived runs on both firewalls, the websites become unreachable for approximately one minute, and then the connection recovers.
What could be the problem? Could it be that the keepalived instances are constantly switching state (MASTER or BACKUP)?
Firewall-2 runs in MASTER state. When keepalived is started on firewall-1, it goes into BACKUP state.
Are there commands or tools like ipvsadm to check the real state of keepalived?
Configuration keepalived.conf on firewall-1
root@firewall-1:/etc/keepalived# head -n100 keepalived.conf
global_defs {
    router_id fw_1
}
vrrp_sync_group loadbalancers {
    group {
        extern
        intern
    }
}
vrrp_instance extern {
    state BACKUP
    priority 100
    interface eth0.100
    garp_master_delay 5
    virtual_router_id 40
    advert_int 1
    authentication {
        auth_type AH
        auth_pass xxxx
    }
    virtual_ipaddress {
        194.xx.xx.x1
        194.xx.xx.x2
        194.xx.xx.x3
        194.xx.xx.xx
        194.xx.xx.xx
        194.xx.xx.x7
    }
}
vrrp_instance intern {
    state BACKUP
    priority 100
    notify "/usr/local/sbin/restart_pound"
    interface eth0.200
    garp_master_delay 5
    virtual_router_id 41
    advert_int 1
    authentication {
        auth_type AH
        auth_pass xxxx
    }
    virtual_ipaddress {
        192.168.100.1
        192.168.100.10
    }
}
..........
..........
..........
Configuration keepalived.conf on firewall-2
root@firewall-2:/opt# head -n100 /etc/keepalived/keepalived.conf
global_defs {
    router_id fw_2
}
vrrp_sync_group loadbalancers {
    group {
        extern
        intern
    }
}
vrrp_instance extern {
    state MASTER
    priority 200
    interface eth1
    garp_master_delay 5
    virtual_router_id 40
    advert_int 1
    authentication {
        auth_type AH
        auth_pass xxxx
    }
    virtual_ipaddress {
        194.xx.xx.x1
        194.xx.xx.x2
        194.xx.xx.x3
        194.xx.xx.xx
        194.xx.xx.xx
        194.xx.xx.x7
    }
}
vrrp_instance intern {
    state MASTER
    priority 200
    notify "/usr/local/sbin/restart_pound"
    interface eth0.200
    garp_master_delay 5
    virtual_router_id 41
    advert_int 1
    authentication {
        auth_type AH
        auth_pass xxxx
    }
    virtual_ipaddress {
        192.168.100.1
        192.168.100.10
    }
}
........
........
You asked about commands or tools to check the real state of keepalived. Probably the best is to use:
You should see periodic messages from the master for all virtual router IDs (vrid in the trace).
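A capture along the following lines is one way to watch those advertisements. It is only a sketch: the interface name eth0 and the helper names vrrp_summary and watch_vrrp are placeholders of mine, not anything keepalived ships. It filters on the standard VRRP multicast group 224.0.0.18 rather than on a protocol number, because with auth_type AH the advertisements may be wrapped in AH instead of appearing as plain protocol 112.

```shell
#!/bin/sh
# Sketch: watch VRRP advertisements and summarise which vrid is announced
# with which priority. Interface name is a placeholder; run as root.

# Reduce tcpdump text to "vrid <id> prio <n>" for each advertisement line.
vrrp_summary() {
    sed -n 's/.*\(vrid [0-9][0-9]*\), \(prio [0-9][0-9]*\).*/\1 \2/p'
}

# Capture traffic sent to the VRRP multicast group and summarise it.
# -l: line-buffered output, -n: no name resolution.
watch_vrrp() {
    tcpdump -l -n -i "${1:-eth0}" dst host 224.0.0.18 | vrrp_summary
}

# Usage (as root):  watch_vrrp eth0.200
```

In steady state only the master sends advertisements, so each vrid should show up about once per advert_int with a single priority. If the priority for one vrid keeps alternating between the two configured values, both nodes are advertising, which would fit the suspicion that the state is flapping.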
I tried using
tcpdump -i <interface> 'ip proto 112'
and found that, unless I was on a keepalived system, the advertisements weren't seen. I had to become a member of the multicast group with
ip maddr add <multicast address>
before tcpdump would report the multicast. If you are using unicast this isn't a problem.
Since my question I have found some things which may be helpful to others. First, in my experience keepalived starts in MASTER state regardless of configuration and transitions to its "steady state" within a few seconds. This is critical if you're trying to run scripts on state change which affect the keepalived system: you may find both notify_master and the "steady state" notify_... script running simultaneously and colliding with one another.
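Returning to the multicast membership step: ip maddr add takes a link-layer (MAC) multicast address, not the IP group address, so the group has to be converted first. The helper below is a sketch of the standard IPv4 mapping (prefix 01:00:5e plus the low 23 bits of the group address). The function name mcast_mac and the interface eth0 are placeholders I made up.

```shell
#!/bin/sh
# Sketch: convert an IPv4 multicast group to its Ethernet multicast MAC.

mcast_mac() {
    # Split the dotted quad into the positional parameters $1..$4.
    old_ifs=$IFS; IFS=.; set -- $1; IFS=$old_ifs
    # Only the low 23 bits are mapped, so mask the top bit of octet 2.
    printf '01:00:5e:%02x:%02x:%02x\n' "$(( $2 & 127 ))" "$3" "$4"
}

# Usage (as root), joining the VRRP group on a placeholder interface:
#   ip maddr add "$(mcast_mac 224.0.0.18)" dev eth0
```

For the VRRP group 224.0.0.18 this gives 01:00:5e:00:00:12.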
Second, on newer systems
systemctl status keepalived
may show the state if run soon enough after a state change (and intervening events haven't "scrolled it off").
kill -USR1 <pid of keepalived>
will create /tmp/keepalived.data, which reports keepalived's state; this is reliable if run after "steady state" is achieved. Using this method was my solution to the colliding script problem: sleep long enough to achieve steady state, then use
kill ...
followed by examining the file.
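The sleep, signal, then read sequence could be scripted roughly as below. Treat it as a sketch under assumptions: the PID file path, the 10 second settle delay, and the function names are mine, and the parser assumes the dump contains lines of the form "VRRP Instance = <name>" followed by "State = <STATE>", which may differ between keepalived versions.

```shell
#!/bin/sh
# Sketch: report the settled VRRP state of each instance.
# PID file path, settle delay and dump layout are assumptions.

PIDFILE="${PIDFILE:-/var/run/keepalived.pid}"
DATAFILE="${DATAFILE:-/tmp/keepalived.data}"

# Print "<instance> <state>" for every instance found in a dump file.
parse_states() {
    awk '
        /VRRP Instance = /  { name = $NF }
        /^[ \t]*State = /   { print name, $NF }
    ' "$1"
}

# Wait for steady state, ask keepalived for a dump, then parse it.
settled_states() {
    sleep "${SETTLE_SECONDS:-10}"
    kill -USR1 "$(cat "$PIDFILE")" || return 1
    sleep 1   # give keepalived a moment to write the file
    parse_states "$DATAFILE"
}

# Usage (as root):  settled_states
```

A notify script can call settled_states and act only on the state it reports, instead of trusting the transient state it was invoked for.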