I'm configuring a two-node A/A cluster with common storage attached via iSCSI, using GFS2 on top of clustered LVM. So far I have prepared a simple configuration, but I'm not sure of the right way to configure the gfs resource.
Here is the rm section of /etc/cluster/cluster.conf:
<rm>
    <failoverdomains>
        <failoverdomain name="node1" nofailback="0" ordered="0" restricted="1">
            <failoverdomainnode name="rhc-n1"/>
        </failoverdomain>
        <failoverdomain name="node2" nofailback="0" ordered="0" restricted="1">
            <failoverdomainnode name="rhc-n2"/>
        </failoverdomain>
    </failoverdomains>
    <resources>
        <script file="/etc/init.d/clvm" name="clvmd"/>
        <clusterfs name="gfs" fstype="gfs2" mountpoint="/mnt/gfs" device="/dev/vg-cs/lv-gfs"/>
    </resources>
    <service name="shared-storage-inst1" autostart="0" domain="node1" exclusive="0" recovery="restart">
        <script ref="clvmd">
            <clusterfs ref="gfs"/>
        </script>
    </service>
    <service name="shared-storage-inst2" autostart="0" domain="node2" exclusive="0" recovery="restart">
        <script ref="clvmd">
            <clusterfs ref="gfs"/>
        </script>
    </service>
</rm>
This is what I mean: when using the clusterfs resource agent to handle a GFS partition, it is not unmounted by default (unless the force_unmount option is given). This way, when I issue
clusvcadm -s shared-storage-inst1
clvm is stopped, but GFS is not unmounted, so the node can no longer alter the LVM structure on the shared storage, yet can still access the data. And even though a node can do this quite safely (dlm is still running), this seems rather inappropriate to me, since clustat
reports that the service on that node is stopped. Moreover, if I later try to stop cman on that node, it will find the dlm locking produced by GFS and fail to stop.
I could have simply added force_unmount="1", but I would like to know the reason behind the default behavior. Why is it not unmounted? Most of the examples out there silently use force_unmount="0", some don't, but none of them give any clue as to how the decision was made.
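For comparison, setting the option would only add one attribute to the resource definition above (same names and device as in my config); a minimal sketch:

```xml
<resources>
    <script file="/etc/init.d/clvm" name="clvmd"/>
    <!-- force_unmount="1": on stop, rgmanager kills processes using the
         mountpoint and then unmounts it, instead of leaving it mounted -->
    <clusterfs name="gfs" fstype="gfs2" mountpoint="/mnt/gfs"
               device="/dev/vg-cs/lv-gfs" force_unmount="1"/>
</resources>
```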
Apart from that, I have found sample configurations where people manage GFS partitions with the gfs2 init script - https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Defining_The_Resources
or even by simply enabling services such as clvm and gfs2 to start automatically at boot (http://pbraun.nethence.com/doc/filesystems/gfs2.html), like:
chkconfig gfs2 on
If I understand the latter approach correctly, such a cluster only controls whether nodes are still alive and can fence errant ones, but it has no control over the status of its resources.
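If I understand it right, the full boot-time setup would be roughly the following (a sketch; chkconfig is RHEL-style, so on Ubuntu the equivalent would be update-rc.d):

```shell
# enable the cluster and storage daemons at boot;
# nothing here is managed or monitored by rgmanager
chkconfig cman on
chkconfig clvmd on
chkconfig gfs2 on
```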
I have some experience with Pacemaker, and I'm used to all resources being controlled by the cluster, so that action can be taken not only when there are connectivity issues, but also when any of the resources misbehave.
So, which is the right way for me to go:
1. leave the GFS partition mounted (any reasons to do so?)
2. set force_unmount="1". Won't this break anything? Why is this not the default?
3. use a script resource
<script file="/etc/init.d/gfs2" name="gfs"/>
to manage the GFS partition
4. start it at boot and don't include it in cluster.conf (any reasons to do so?)
This may be the sort of question that cannot be answered unambiguously, so it would also be of much value to me if you shared your experience or thoughts on the issue. What does /etc/cluster/cluster.conf look like, for example, when configuring gfs with Conga or ccs? (They are not available to me, since for now I have to use Ubuntu for the cluster.)
Thank you very much!
I have worked a little with clusters. These are my opinions on the subject.
If you choose to configure gfs as a clustered resource, and add the clvmd and gfs disk as resources, then when you fail over with rgmanager, it will try to unmount the disk. So the first thing I'd do in your case is check the logs (or lsof/fuser, etc.) for an indication of why the unmounting might have failed. Likely there is a process holding a file open or something like that, preventing a "clean" unmount.
Could it be because you don't use rgmanager to start your clustered application? I don't see it in your cluster.conf. If true, that would explain the behaviour.
If you choose to set force_unmount, what rgmanager will do when failing over/recovering is forcefully kill any resource using the disk before unmounting it. Whether that is a good idea or not depends.
If you want to change the LVM structure in this scenario, you can start the clvmd daemon again manually. If you unmount the gfs disk before stopping cman, that should work. On the other hand, in a production scenario I rarely find myself in a situation where I'd want to stop CMAN on a clustered node.
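In other words, the manual sequence would be something like this (a sketch, assuming the mountpoint from your question and RHEL-style service commands):

```shell
service clvmd start   # regain the ability to change LVM metadata manually

# ... later, when taking the node out of the cluster:
umount /mnt/gfs       # releases the dlm locks held by GFS2
service cman stop     # can now stop cleanly, no dlm lockspace remains
```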
My preference is to go with option 4.
It is true that if you don't add gfs2 and clvmd as cluster resources, rgmanager won't be able to control them. What I usually do when setting up A/A clusters (depending on the case, of course) is add the start script for my service as the clustered resource. (rgmanager will then call the script with the status argument on a regular basis to determine whether it needs to take the configured action.) Since my script has a dependency on the gfs filesystem, it will fail unless the filesystem is mounted.
The fourth approach implies manually enabling clvmd, cman and gfs2 (and possibly other daemons too, depending on the situation).
Since the GFS filesystem sits on top of an iSCSI device, adding the _netdev option to the mount in /etc/fstab is a requirement for it to work.
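The corresponding /etc/fstab entry would look something like this (mount options other than _netdev are illustrative; device and mountpoint taken from the question):

```
# _netdev defers the mount until the network (and thus iSCSI) is up
/dev/vg-cs/lv-gfs  /mnt/gfs  gfs2  defaults,noatime,_netdev  0 0
```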
There are a few disadvantages I can think of too:
- the gfs2 mount itself is not monitored or recovered by rgmanager
- updatedb and other jobs which might want to traverse the filesystem can cause drive latency (locking traffic)
No matter what you decide, I would add the init script as a clustered resource, and if you choose to add gfs and clvm to the cluster as resources, I'd consider adding the __independent_subtree attribute to them, so that if they fail, rgmanager won't re-mount the gfs filesystem. This depends, of course, on your particular situation. Note the nested configuration in the link, marking a sort of dependency tree.
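Applied to one of the services from the question, that would look something like the following sketch (attribute placement based on my experience; verify against your rgmanager version):

```xml
<service name="shared-storage-inst1" autostart="0" domain="node1" exclusive="0" recovery="restart">
    <script ref="clvmd">
        <!-- __independent_subtree="1": this subtree is recovered on its own,
             so its failure does not force a restart (and re-mount) of the
             whole service tree -->
        <clusterfs ref="gfs" __independent_subtree="1"/>
    </script>
</service>
```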