My shop is putting together a three-node cluster to serve as a Hyper-V host for our customer. Previous installations we did for this customer had only two nodes per cluster, so we used Node and Disk Majority. Easy enough. But now that we're moving to an odd number of nodes, we're in a quandary: do what Microsoft recommends and use Node Majority, or go with No Majority: Disk Only?
I wanted to ask the community:
- Has anyone used No Majority in an operational environment?
- If so, is it something you would recommend?
The reason I ask is that with the recommended setting we could tolerate only one server failure, versus two with No Majority. The flip side is that No Majority would result in a "single" point of failure (which is something I think we can mitigate). My customer would really like to have this cluster up, even if only one node is available. To give proper context to the questions, here is our physical configuration:
- Our three servers have multiple NICs, with two on each dedicated exclusively to SAN traffic
- The servers have Device Specific Modules installed to support multipathing SAN traffic
- We have two iSCSI SAN appliances, each running 8 disks in a RAID-5 configuration with two-way replication (data is mirrored between the two). Each has two bonded NICs.
- The SANs are full-mesh connected to the servers via two dedicated switches
- The quorum disk is on a dedicated LUN and used only for that purpose
- VHDs will be stored on the SAN as well, so if we don't have access to it at all, there's really no need for quorum (since that's the purpose of this cluster)
- The number of VM guests will be limited, so that they can all be hosted on one server at the same time without overloading it
Everything I've seen online is saying the same thing about this configuration being dangerous, but I don't know if they're just parroting or have actually validated the information. I guess I'm looking for reassurance from the community that I'm making the correct choice, since I'm deviating from what Microsoft recommends.
When you have a failover cluster with an even number of nodes, but your network isn't redundant, you still have a single point of failure. But it sounds like you have a fully redundant network setup; in that case, I would go with node majority.
With node majority, your failover cluster will at least still be available even if the SAN appliances die completely. If you go with no majority, you have to have the utmost confidence in that disk's availability, and I would place more faith in much dumber, simpler switches than in mirrored SANs. Besides, if your redundant network dies, it won't matter if the SAN is fine.
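To make the trade-off concrete, here is a rough Python sketch of the vote counting behind the three quorum modes discussed here. It is a simplified model under my own assumptions, not how the cluster service is implemented: the function name `has_quorum` and the mode labels are made up for illustration, and it only counts votes, so it says nothing about whether the VMs themselves can reach their VHDs.

```python
def has_quorum(mode, nodes_total, nodes_up, disk_up):
    """Simplified vote count for a failover cluster (illustrative only).

    mode        -- "node_majority", "node_and_disk_majority" or "disk_only"
                   (hypothetical labels, not real configuration values)
    nodes_total -- number of nodes configured in the cluster
    nodes_up    -- number of nodes currently running and able to communicate
    disk_up     -- True if the quorum (witness) disk is reachable
    """
    if nodes_up < 1:
        # With no node running there is nothing left to hold quorum.
        return False

    if mode == "node_majority":
        # One vote per node; more than half of all nodes must be up.
        return nodes_up > nodes_total // 2

    if mode == "node_and_disk_majority":
        # One vote per node plus one vote for the disk.
        votes_total = nodes_total + 1
        votes_up = nodes_up + (1 if disk_up else 0)
        return votes_up > votes_total // 2

    if mode == "disk_only":
        # Only the disk counts; any single node that can reach it is enough.
        return disk_up

    raise ValueError(f"unknown quorum mode: {mode}")


if __name__ == "__main__":
    # Walk through the three-node case from the question.
    for mode in ("node_majority", "disk_only"):
        for nodes_up in (3, 2, 1):
            for disk_up in (True, False):
                state = has_quorum(mode, 3, nodes_up, disk_up)
                print(f"{mode:14} nodes_up={nodes_up} disk_up={disk_up!s:5} quorum={state}")
```

In this model, a three-node cluster on node majority keeps quorum with two nodes up whether or not the quorum disk is reachable, and loses it when only one node is left. With disk only, a single surviving node keeps quorum as long as it can reach the disk, but all three nodes together lose it the moment the disk is unreachable.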