I'm trying to set up VMware HA for the cluster, and I'm having trouble understanding how resource monitoring works. We overcommit memory as a general practice, so our provisioned memory is always ~1.5x higher than what individual VMs actually use.
I created a cluster with 2 hosts in it. One was ~90% full in terms of memory (I mean used memory; provisioned was ~140%), and the second host ran no VMs. When I tried powering on a VM, I got an error saying that doing so would make it impossible to tolerate one host failure.
Reading more, I found that when this happens and you disable the policy to prevent power on, VMware will not guarantee failover for all hosts.
- But does it mean that it will just not try to fail over if it thinks that there are not enough resources?
- Or does it mean that something bad might happen because memory usage will exceed what is available, and it'll have to start swapping?
- How does it make such decisions?
This is normal behavior. In vCenter Server, right-click the cluster, choose Edit Settings > VMware HA, and select "Disable: Power on VMs that violate availability constraints." That should fix your issue.
Basically, in a 2-node HA cluster, once one node is down the VMs have no machine left to fail over to if the second node also fails. Disabling that check is therefore normal for 2-node clusters.
If you have 3 or more nodes in a cluster, you can keep that option enabled.
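On "How does it make such decisions?": below is a toy sketch of the kind of check involved, not VMware's actual algorithm. As far as I know, the real admission control depends on the policy you pick and works from reservations rather than live usage, so every name and number here (`Host`, `reserved_gb`, `tolerates_one_host_failure`, the 64 GB hosts) is made up for illustration. The sketch only asks whether, for each host that could fail, the surviving hosts have enough spare memory to take over its VMs, and it ignores that each VM has to fit on a single host.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    memory_gb: int    # physical memory on the host (hypothetical figure)
    reserved_gb: int  # memory its VMs are assumed to need after a failover


def tolerates_one_host_failure(hosts, new_vm_gb=0, placed_on=None):
    """Return True if any single host can fail and the rest can absorb its VMs."""
    demand = {h.name: h.reserved_gb for h in hosts}
    if placed_on is not None:
        # Pretend the new VM is already running on the chosen host.
        demand[placed_on] += new_vm_gb
    for failed in hosts:
        # Spare memory summed over every host except the failed one.
        spare = sum(h.memory_gb - demand[h.name] for h in hosts if h is not failed)
        if demand[failed.name] > spare:
            return False
    return True


# Two hosts similar to the question: one about 90% full, one empty.
hosts = [Host("esx1", 64, 58), Host("esx2", 64, 0)]
print(tolerates_one_host_failure(hosts))                                # current state
print(tolerates_one_host_failure(hosts, new_vm_gb=8, placed_on="esx2"))  # after power-on
```

In this model the cluster is fine as it stands, but powering on an 8 GB VM on the empty host fails the check, because the nearly full host would have only 6 GB spare if the empty host died. With admission control disabled, nothing performs such a check before power-on, so HA can only restart whatever still fits after a failure.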