Question

quanta

Asked: 2012-06-06 03:53:12 +0800 CST

IP failover with 2 nodes on different subnet: cannot ping virtual IP from second node?

I'm going to set up a redundant Redmine installation with failover:

a second instance was installed on the second server without problems
MySQL (running on the same machine as Redmine) was configured for master-master replication

Because the two nodes are in different subnets (192.168.3.x and 192.168.6.x), it seems that VIPArip is the only choice.

/etc/ha.d/ha.cf on node1:

logfacility none
debug 1
debugfile /var/log/ha-debug
logfile /var/log/ha-log
autojoin none
warntime 3
deadtime 6
initdead 60
udpport 694
ucast eth1 node2.ip
keepalive 1
node node1
node node2
crm respawn

/etc/ha.d/ha.cf on node2:

logfacility none
debug 1
debugfile /var/log/ha-debug
logfile /var/log/ha-log
autojoin none
warntime 3
deadtime 6
initdead 60
udpport 694
ucast eth0 node1.ip
keepalive 1
node node1
node node2
crm respawn

crm configure show:

node $id="6c27077e-d718-4c82-b307-7dccaa027a72" node1
node $id="740d0726-e91d-40ed-9dc0-2368214a1f56" node2
primitive VIPArip ocf:heartbeat:VIPArip \
        params ip="192.168.6.8" nic="lo:0" \
        op start interval="0" timeout="20s" \
        op monitor interval="5s" timeout="20s" depth="0" \
        op stop interval="0" timeout="20s" \
        meta is-managed="true"
property $id="cib-bootstrap-options" \
        stonith-enabled="false" \
        dc-version="1.0.12-unknown" \
        cluster-infrastructure="Heartbeat" \
        last-lrm-refresh="1338870303"

crm_mon -1:

============
Last updated: Tue Jun  5 18:36:42 2012
Stack: Heartbeat
Current DC: node2 (740d0726-e91d-40ed-9dc0-2368214a1f56) - partition with quorum
Version: 1.0.12-unknown
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node1 node2 ]

 VIPArip    (ocf::heartbeat:VIPArip):   Started node1

ip addr show lo:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet 192.168.6.8/32 scope global lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever

I can ping 192.168.6.8 from node1 (192.168.3.x):

# ping -c 4 192.168.6.8
PING 192.168.6.8 (192.168.6.8) 56(84) bytes of data.
64 bytes from 192.168.6.8: icmp_seq=1 ttl=64 time=0.062 ms
64 bytes from 192.168.6.8: icmp_seq=2 ttl=64 time=0.046 ms
64 bytes from 192.168.6.8: icmp_seq=3 ttl=64 time=0.059 ms
64 bytes from 192.168.6.8: icmp_seq=4 ttl=64 time=0.071 ms

--- 192.168.6.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.046/0.059/0.071/0.011 ms

but I cannot ping the virtual IP from node2 (192.168.6.x) or from outside. Did I miss something?

PS: You will probably want to set IP2UTIL=/sbin/ip in the /usr/lib/ocf/resource.d/heartbeat/VIPArip resource agent script if you get an error like this:

Jun 5 11:08:10 node1 lrmd: [19832]: info: RA output: (VIPArip:stop:stderr) 2012/06/05_11:08:10 ERROR: Invalid OCF_RESKEY_ip [192.168.6.8]

http://www.clusterlabs.org/wiki/Debugging_Resource_Failures

Reply to @DukeLion:

Which router receives RIP updates?

When I start the VIPArip resource, ripd runs with the configuration file below (on node1):

/var/run/resource-agents/VIPArip-ripd.conf:

hostname ripd
password zebra
debug rip events
debug rip packet
debug rip zebra
log file /var/log/quagga/quagga.log
router rip
!nic_tag
 no passive-interface lo:0
 network lo:0
 distribute-list private out lo:0
 distribute-list private in lo:0
!metric_tag
 redistribute connected metric 3
!ip_tag
access-list private permit 192.168.6.8/32
access-list private deny any

show ip route:

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, A - Babel,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 via 192.168.3.1, eth1
C>* 127.0.0.0/8 is directly connected, lo
K>* 169.254.0.0/16 is directly connected, eth1
C>* 192.168.3.0/24 is directly connected, eth1
C>* 192.168.6.8/32 is directly connected, lo

sh ip rip status:

Routing Protocol is "rip"
  Sending updates every 30 seconds with +/-50%, next due in 7 seconds
  Timeout after 180 seconds, garbage collect after 120 seconds
  Outgoing update filter list for all interface is not set
    lo:0 filtered by private
  Incoming update filter list for all interface is not set
    lo:0 filtered by private
  Default redistribution metric is 1
  Redistributing: connected
  Default version control: send version 2, receive any version 
    Interface        Send  Recv   Key-chain
  Routing for Networks:
    lo:0
  Routing Information Sources:
    Gateway          BadPackets BadRoutes  Distance Last Update
  Distance: (default is 120)

1 Answer

DukeLion · Answer 1 · 2012-06-07T06:29:53+08:00

I think the problem is not in your cluster configuration but in your routing architecture.

The VIPArip resource agent manages a local Quagga instance that sends routing updates. But something also has to consume those routing updates so that the routes point to the active server. I'll try to explain how it works.

[Figure: RIP HA]

Consider the diagram. HA1 and HA2 are Linux-HA cluster members running Quagga. The blue router listens for RIP updates on both network links.

When the VIP comes up on HA1, Quagga sends a RIP update to the blue router, which adds the VIP prefix to its routing table with 192.168.1.2 as the next hop.
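
To illustrate this step, here is a minimal Python sketch of how a router picks a next hop by longest-prefix match once it has learned the VIP as a host route. The VIP 192.168.6.8/32 and the next hop 192.168.1.2 come from this thread; the default route via 10.0.0.1 and the function name `lookup` are hypothetical and only there for illustration. It is a simplified model, not how Quagga or any particular router is implemented.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the route with the largest prefix length wins.
    _, best_hop = max(matches, key=lambda match: match[0].prefixlen)
    return best_hop


# Hypothetical starting table: only a default route (10.0.0.1 is an assumption).
routes = {ipaddress.ip_network("0.0.0.0/0"): "10.0.0.1"}
before = lookup(routes, "192.168.6.8")

# The RIP update from HA1 installs a /32 host route for the VIP.
routes[ipaddress.ip_network("192.168.6.8/32")] = "192.168.1.2"
after = lookup(routes, "192.168.6.8")

print("before update:", before)
print("after update: ", after)
```

Until the router has received the update, traffic for the VIP follows whatever less specific route already covers it, which is why the VIP is only reachable from places whose routers actually hear the RIP announcements.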

When failover occurs, the VIP goes down on HA1 and Quagga stops completely, so no more updates are sent. The blue router removes the routing table entry after a timeout, even if the VIP never comes up on HA2. When the VIP does come up on HA2, Quagga is started there and sends RIP updates, and the blue router adds an entry to its routing table with 192.168.2.2 as the next hop.
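
The failover sequence above can be sketched as a small Python simulation of the blue router's table. The 180-second timeout is taken from the `sh ip rip status` output in the question; the class `RipRouter`, its method names, and the timestamps are hypothetical. The sketch deliberately ignores RIP metrics and the separate garbage-collection timer, so treat it as a rough model rather than a description of real RIP behavior.

```python
RIP_TIMEOUT = 180  # seconds, as in "Timeout after 180 seconds" from the question


class RipRouter:
    """Toy model of a router that learns routes from periodic updates."""

    def __init__(self, timeout=RIP_TIMEOUT):
        self.timeout = timeout
        self.routes = {}  # prefix -> (next_hop, time of last update)

    def receive_update(self, prefix, next_hop, now):
        # Every update refreshes (or replaces) the entry for the prefix.
        self.routes[prefix] = (next_hop, now)

    def expire(self, now):
        # Drop entries that have not been refreshed within the timeout.
        stale = [p for p, (_, seen) in self.routes.items()
                 if now - seen >= self.timeout]
        for prefix in stale:
            del self.routes[prefix]

    def next_hop(self, prefix, now):
        self.expire(now)
        entry = self.routes.get(prefix)
        return entry[0] if entry else None


VIP = "192.168.6.8/32"
router = RipRouter()

# HA1 is active and announces the VIP; the last update arrives at t=30.
router.receive_update(VIP, "192.168.1.2", now=0)
router.receive_update(VIP, "192.168.1.2", now=30)
hop_active = router.next_hop(VIP, now=40)

# HA1 stops announcing; the route times out 180 s after the last update.
hop_expired = router.next_hop(VIP, now=30 + RIP_TIMEOUT)

# HA2 takes over and starts announcing the VIP.
router.receive_update(VIP, "192.168.2.2", now=220)
hop_failover = router.next_hop(VIP, now=225)

print(hop_active, hop_expired, hop_failover)
```

In this model the VIP is unreachable between the expiry of HA1's route and the first update from HA2; if HA2 starts announcing before the old entry expires, the new update simply replaces it.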

VIPArip can also be used in a more complex network topology; just make sure that your border routers receive the routing updates throughout your network.
