We had a power outage at our data center last week, and when our dual PIX 515Es running IOS 7.0(8) (configured with a failover cable) came back, they were in a failed-over state where the Secondary unit is active and the Primary unit is standby. I have tried 'failover reset', 'failover active', and 'failover reload-standby', as well as reloading both units in a variety of orders, and they don't come back as Primary/Active, Secondary/Standby. The only thing in my arsenal that I haven't tried is driving to the data center and performing a hard reboot, which I hate to do.
I have read "How Failover Works on the Cisco Secure Firewall" and it seems like this should be wicked straightforward.
Output of 'show failover' on Primary:
Failover On
Cable status: Normal
Failover unit Primary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:52:05 UTC Mar 10 2010
This host: Primary - Standby Ready
Active time: 0 (sec)
Interface outside (x.x.x.165): Normal
Interface inside (y.y.y.3): Normal
Other host: Secondary - Active
Active time: 897045 (sec)
Interface outside (x.x.x.164): Normal
Interface inside (y.y.y.4): Normal
Stateful Failover Logical Update Statistics
Link : Unconfigured.
Output of 'show failover' on Secondary:
Failover On
Cable status: Normal
Failover unit Secondary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:03:04 UTC Feb 28 2010
This host: Secondary - Active
Active time: 896925 (sec)
Interface outside (x.x.x.164): Normal
Interface inside (y.y.y.4): Normal
Other host: Primary - Standby Ready
Active time: 0 (sec)
Interface outside (x.x.x.165): Normal
Interface inside (y.y.y.3): Normal
Stateful Failover Logical Update Statistics
Link : Unconfigured.
I'm seeing the following in my syslog:
Mar 10 03:05:00 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:05:09 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reload-standby' command.
Mar 10 03:05:12 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed.
Mar 10 03:05:12 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed.
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed.
Mar 10 03:06:09 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down.
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed.
Mar 10 03:06:10 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up.
Mar 10 03:06:10 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed.
Mar 10 03:06:23 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready.
Mar 10 03:06:23 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready.
Mar 10 03:06:24 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.
Mar 10 03:07:05 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:07:31 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover active' command.
Mar 10 03:08:04 fw1 %PIX-5-611103: User logged out: Uname: enable_1
Mar 10 03:08:04 fw1 %PIX-6-315011: SSH session from admin1_int on interface inside for user "pix" terminated normally
Mar 10 03:08:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed.
Mar 10 03:08:39 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed.
Mar 10 03:09:10 fw1 %PIX-6-605005: Login permitted from admin1_int/36891 to inside:192.168.4.4/ssh for user "pix"
Mar 10 03:09:23 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:09:38 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed.
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down.
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed.
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up.
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed.
Mar 10 03:09:52 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready.
Mar 10 03:09:52 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready.
Mar 10 03:09:53 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.
I'm not exactly sure how to interpret that syslog data. Primary doesn't seem to even try to become Active. When I reload the individual units separately, my connections are retained, so it doesn't seem like I have a real hardware failure. Is there something I can query (IOS or SNMP) to check for hardware issues?
Any thoughts? My IOS-fu is weak.
Thanks for any help you might provide, Aaron
Please DO NOT use the 'no failover' command as mentioned by natacado. Instead, use the 'no failover active' command on the secondary (currently active) firewall. The first command turns off failover; the second command relinquishes active status to the other firewall in the HA pair. If you run 'failover active', please run it on the primary (currently standby) firewall.
I don't believe the PIX provides a facility to allow automatic preemption when the primary firewall is ready to process traffic again.
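As a rough sketch, the role swap described in this answer might look like the following at the console. The prompts assume the hostnames from the syslog excerpt in the question (fw1 logging as Secondary, fw2 logging as Primary); the lines starting with "!" are comments, and only one of the first two commands should be needed.

```text
! Option A: on fw1, the Secondary unit that is currently active,
! give up the active role:
fw1# no failover active

! Option B: on fw2, the Primary unit that is currently standby,
! take over the active role:
fw2# failover active

! Afterwards, check the roles on either unit:
fw2# show failover
```

If the swap takes effect, the 'show failover' output on the Primary should report that host as active instead of "Standby Ready".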
Please post your failover configuration ("show run failover"). Or try to enable preemption (you will need to specify manually which unit is primary and which is secondary).
At least with ASA5500 series units, what you want is to run the following on VPN-Primary:
no failover
This should also work on PIXes with relatively recent OSes. Essentially, think of 'failover' as a command that tells the units to try to make the secondary be the active unit, and like many configuration commands, 'no failover' removes the action.
FWIW, the only way we were able to resolve this issue was by physically powering down both firewalls, then bringing them back up in the correct order. None of the suggestions above were able to resolve the issue for me. Thanks to everyone for your time and help, though.