For testing purposes, I have brought up two Debian jessie VMs (using Vagrant and VirtualBox), each running a mini web server and configured with keepalived.
The problem is that, a few seconds after running service keepalived restart, the BACKUP server transitions to MASTER for no apparent reason.
I can also see VRRP multicast packets arriving on both servers, and they look as expected, which suggests that the solution proposed in "Keepalived unwanted transition to master" does not apply here.
The configuration files are as follows:
MASTER server:
global_defs {
    lvs_id tom_lvs
}
vrrp_instance tom_lvs {
    state MASTER
    interface eth1
    virtual_router_id 1
    lvs_sync_daemon_interface eth1
    priority 100
    preempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    advert_int 1
    virtual_ipaddress {
        172.28.128.10/24
    }
    virtual_server 172.28.128.10 3000 {
        delay_loop 10
        lb_algo wlc
        lb_kind DR
        protocol TCP
        persistence_timeout 1800
        sorry_server 172.28.128.3 3000
        real_server 172.28.128.4 3000 {
            weight 5
            HTTP_GET {
                url {
                    path /index.html
                }
            }
        }
    }
}
BACKUP server:
global_defs {
    lvs_id tom_lvs
}
vrrp_instance tom_lvs {
    state BACKUP
    interface eth1
    virtual_router_id 2
    lvs_sync_daemon_interface eth1
    priority 96
    preempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    advert_int 1
    virtual_ipaddress {
        172.28.128.10/24
    }
    virtual_server 172.28.128.10 3000 {
        delay_loop 10
        lb_algo wlc
        lb_kind DR
        protocol TCP
        persistence_timeout 1800
        sorry_server 172.28.128.3 3000
        real_server 172.28.128.3 3000 {
            weight 5
            HTTP_GET {
                url {
                    path /index.html
                }
            }
        }
    }
}
Logs on the backup server showing the transition from BACKUP to MASTER:
Feb 14 13:22:44 jessie Keepalived_vrrp[1145]: VRRP_Instance(tom_lvs) sending 0 priority
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Registering Kernel netlink reflector
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Registering Kernel netlink command channel
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Registering gratuitous ARP shared channel
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Registering Kernel netlink reflector
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Configuration is using : 62104 Bytes
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: Using LinkWatch kernel netlink reflector...
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Registering Kernel netlink command channel
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Opening file '/etc/keepalived/keepalived.conf'.
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Configuration is using : 12103 Bytes
Feb 14 13:22:44 jessie Keepalived_vrrp[1176]: VRRP_Instance(tom_lvs) Entering BACKUP STATE
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Using LinkWatch kernel netlink reflector...
Feb 14 13:22:44 jessie Keepalived_healthcheckers[1175]: Activating healthchecker for service [172.28.128.3]:3000
Feb 14 13:22:48 jessie Keepalived_vrrp[1176]: VRRP_Instance(tom_lvs) Transition to MASTER STATE
Feb 14 13:22:49 jessie Keepalived_vrrp[1176]: VRRP_Instance(tom_lvs) Entering MASTER STATE
And this is what I see on both machines in tshark:
Capturing on 'eth1'
1 0.000000 172.28.128.4 -> 224.0.0.18 VRRP 60 Announcement (v2)
2 0.001404 172.28.128.3 -> 224.0.0.18 VRRP 54 Announcement (v2)
3 1.002283 172.28.128.4 -> 224.0.0.18 VRRP 60 Announcement (v2)
4 1.004130 172.28.128.3 -> 224.0.0.18 VRRP 54 Announcement (v2)
5 2.004466 172.28.128.4 -> 224.0.0.18 VRRP 60 Announcement (v2)
6 2.006260 172.28.128.3 -> 224.0.0.18 VRRP 54 Announcement (v2)
I came across this question while researching similar behaviour, so I thought I'd post the answer for the benefit of anyone else who is similarly puzzled.
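Both config files need to use the same value for virtual_router_id, as this is how keepalived knows that VRRP advertisements belong to the same virtual router.
In the config above, the MASTER uses virtual_router_id 1 and the BACKUP uses virtual_router_id 2, so the two instances operate independently: the BACKUP never sees an advertisement for its own router ID and promotes itself to MASTER after a few seconds. Setting the same value on both servers (for example 1) fixes this.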
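To make this easier to spot, here is a minimal Python sketch (not part of keepalived; the helper names and the trimmed-down config strings are made up for this illustration) that compares virtual_router_id per vrrp_instance across two configs. It also computes the VRRPv2 master-down interval from RFC 3768, 3 * advert_int + (256 - priority) / 256, which I am assuming keepalived follows closely enough to explain the roughly four-second gap between "Entering BACKUP STATE" and "Transition to MASTER STATE" in the logs.

```python
def vrrp_router_ids(config_text):
    """Map each vrrp_instance name to its virtual_router_id.

    Naive line-based scan; assumes one directive per line, as in the
    configs posted in the question.
    """
    ids = {}
    current = None
    for line in config_text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "vrrp_instance":
            current = tokens[1]
        elif len(tokens) >= 2 and tokens[0] == "virtual_router_id" and current:
            ids[current] = int(tokens[1])
    return ids


def mismatched_instances(conf_a, conf_b):
    """Return instances present in both configs whose router IDs differ."""
    a, b = vrrp_router_ids(conf_a), vrrp_router_ids(conf_b)
    return {name: (a[name], b[name])
            for name in a.keys() & b.keys() if a[name] != b[name]}


def master_down_interval(priority, advert_int=1):
    """VRRPv2 master-down interval in seconds (RFC 3768)."""
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


# Trimmed-down versions of the two configs from the question.
MASTER_CONF = """
vrrp_instance tom_lvs {
    state MASTER
    virtual_router_id 1
    priority 100
}
"""
BACKUP_CONF = """
vrrp_instance tom_lvs {
    state BACKUP
    virtual_router_id 2
    priority 96
}
"""

print(mismatched_instances(MASTER_CONF, BACKUP_CONF))  # {'tom_lvs': (1, 2)}
print(master_down_interval(priority=96))               # 3.625
```

With priority 96 and advert_int 1, the BACKUP waits about 3.6 seconds without hearing an advertisement for its own router ID before taking over, which is consistent with the timestamps in the question's logs.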