I am trying to understand the Heartbeat
setup in a new environment. It is a 2-node cluster still running version 1 of Heartbeat (the release that does not use the Pacemaker CRM), and I have a fundamental question that I could not find an easy-to-understand answer to on Google.
The question is: in the case of a communication failure between the nodes in the cluster, with both nodes still functioning well, how does the cluster manager identify which node is to be shot? I see a ping_group
directive in /etc/ha.d/ha.cf
. From what I have read, the cluster manager checks connectivity from each cluster node to the hosts listed in ping_group
and uses that to decide which node should be shot(?). But what if the connections from both nodes to the ping hosts are alive, and only the heartbeat network between the two cluster nodes is down? What am I missing here?
Situation: only the heartbeat network is down, but both nodes are up and fine.
root@automan00:/root : cat /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 500ms
deadtime 30
warntime 10
initdead 120
udpport 694
baud 19200
bcast bond1 eth2
auto_failback off
node automan00
node automan01
ping_group group1 1.1.1.1 2.2.2.2
respawn hacluster /usr/lib64/heartbeat/ipfail
realtime on
# stonith directive
stonith external/riloe /etc/ha.d/riloe.cfg
One option is to add a crossover cable between the nodes, with private IPs, as an additional heartbeat network.
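As a sketch, the extra link would be declared alongside the existing ones in /etc/ha.d/ha.cf; the interface name and serial device here are assumptions for illustration, not taken from your setup:

```
# Hypothetical additional heartbeat paths in /etc/ha.d/ha.cf.
# eth3 is assumed to be the crossover-cable interface:
bcast eth3              # extra broadcast heartbeat over the crossover link
serial /dev/ttyS0       # optional serial link as a further fallback
```

More independent heartbeat paths reduce the chance that all of them fail at once, but with only two nodes the ambiguity described below still remains.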
However: when communication fails between only two nodes, neither side can know which node should be shot. This is why you need a third node before going to production.
Without a third node to arbitrate which node is working properly and which is not, you will find yourself in a split-brain situation:
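A minimal sketch of why this is so, assuming a toy model where each node only observes which peers it can reach and fences the other side only when it holds a strict majority of the cluster (the function name and model are mine, not Heartbeat's actual logic):

```python
# Toy quorum model: a node may fence its peers only when its own
# partition holds a strict majority of the cluster's votes.
def should_fence_peer(reachable_peers, total_nodes):
    """Return True if this node's partition is a strict majority."""
    partition_size = 1 + len(reachable_peers)  # this node plus reachable peers
    return partition_size > total_nodes / 2

# 2-node cluster, heartbeat link down: each node sees a partition of
# size 1 out of 2 -- no majority, so neither can safely shoot the other.
print(should_fence_peer([], total_nodes=2))         # False

# 3-node cluster: the side that still sees one peer holds 2 of 3 votes
# and may fence; the isolated node holds 1 of 3 and must not.
print(should_fence_peer(["node3"], total_nodes=3))  # True
print(should_fence_peer([], total_nodes=3))         # False
```

With two nodes, "I cannot reach my peer" looks identical whether the peer is dead or only the link is down, so no local rule can break the tie; a third vote makes one partition a majority.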
https://en.wikipedia.org/wiki/Split-brain_(computing)
It is not good practice to have a "kill myself" tool (a dead-man switch or similar), because a node can never know what happened to the other one. Whether only the communication failed or the other host actually went down, it sees exactly the same behaviour, so it cannot safely kill itself in either case. The same applies from the other node's point of view.
I know this is not a solution, but I hope it helps you understand how the cluster manager works. If you build a cluster, use more than two nodes; it is that simple.