Alright, so I want to start leveraging my SAN a little more than I have been, and at the same time, take advantage of ESXi.
Currently, I've got an array of Dell PowerEdge 1955 blades connected to a single-enclosure EMC AX4-5 FC storage array. I'm essentially using the SAN as DAS: I've got LUNs on the SAN presented to specific physical machines, and those machines use the LUNs for whatever they need (mostly databases and Samba/NFS shares, depending on the target server).
I've got multiple physical file servers, and each one has a Samba config set up to serve the appropriate shares. Since I never got RHCS to work, only one of the file servers has the LUNs mounted at a time. In the event that a file server dies, I fence it manually (either by unmounting and unpresenting the drive using the Navisphere utility, or by killing the power via DRAC), then use the Navisphere utility to present the LUNs to the next contender (after which I start Apache and the other daemons). All by hand, right now.
I feel sort of like Ferris Bueller playing the clarinet. Never had a lesson!
Anyway, I'm trying to improve. What I want to do is install ESXi on the physical hosts, then create LUNs to hold two fileserver images (in case one gets corrupt/fubar), one active and one standby. This way I still haven't improved the automation (although I'll get around to writing a script to switch the "active" server at some point soon), but I feel like I'm adding flexibility. Plus, I can use the ESXi hosts to hold other VMs, so the hardware won't be wasted like it is now.
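For what it's worth, the switch-over script I keep meaning to write would look something like this. It's only a sketch: the hostnames, storage-group names, DRAC address, HLU/ALU numbers, and init scripts are all placeholders, and the naviseccli/racadm invocations would need checking against the real CLIs.

    #!/usr/bin/env python
    # Rough sketch of a manual-failover script. Every name below is a placeholder.
    import subprocess

    FAILED_HOST = "fs1"
    STANDBY_HOST = "fs2"
    DRAC_ADDRESS = "fs1-drac.example.com"
    SP_ADDRESS = "ax4-spa.example.com"
    SERVICES = ["smb", "httpd"]

    def run(cmd):
        print(" ".join(cmd))
        subprocess.check_call(cmd)

    # 1. Fence the failed host by pulling its power via the DRAC.
    run(["ssh", "root@" + DRAC_ADDRESS, "racadm", "serveraction", "powerdown"])

    # 2. Move the LUN from the failed host's storage group to the standby's.
    run(["naviseccli", "-h", SP_ADDRESS, "storagegroup", "-removehlu",
         "-gname", FAILED_HOST, "-hlu", "0"])
    run(["naviseccli", "-h", SP_ADDRESS, "storagegroup", "-addhlu",
         "-gname", STANDBY_HOST, "-hlu", "0", "-alu", "0"])

    # 3. Mount the data volume on the standby and start the daemons.
    run(["ssh", "root@" + STANDBY_HOST, "mount", "/dev/sdb1", "/srv/data"])
    for svc in SERVICES:
        run(["ssh", "root@" + STANDBY_HOST, "/etc/init.d/" + svc, "start"])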
My questions are:
1) How stupid is my plan?
2) When it comes to the actual implementation, should I create a normal VMDK image on the LUN, or should I give the VM a "raw" partition (if that's even possible with ESXi)?
3) Is there a "good" way to use non-clustered fileservers?
Your plan is not nuts. As usual, there are more than a few ways to attack this, depending on what you're trying to achieve and how you want to protect your data.
First up, you can present a raw LUN to a VM using a "Raw Device Mapping" (RDM). To do this, edit the VM's settings, add a new hard disk, choose "Raw Device Mapping" as the disk type, then select the target LUN and a compatibility mode (physical or virtual).
Upside: fast to set up, fast to use, easy, and you can re-present the disk to a physical host if you find yourself needing to V2P down the track.
Downside: you may lose some VMware-based snapshot/rollback options, depending on whether you use physical or virtual compatibility mode.
An alternative is to format the LUN with VMFS to create a datastore, then add a VMDK disk to the VM that lives on that datastore.
In both cases, you're in a similar risk position should VMware or your VM eat the filesystem during a failure; one is not drastically better than the other, although the recovery options available will be quite different.
I don't deploy RDMs unless I have to; I've found they don't buy me much flexibility over a VMDK, and I've been bitten by bugs that made them impractical when performing other storage operations (since fixed; see the RDM section in that link).
As for your VM, your best bet for flexibility is to store your fileserver's boot disk as a VMDK on the SAN, so that other hosts can boot it in the case of a host failure. With VMware's HA functionality, booting your VM on another host is automatic (the VM will boot on the second host as if the power had been pulled, so expect to perform the usual fscks and magic to bring it up, as with a normal server). Note that HA is a licensed feature.
To mitigate a VM failure, you can build a light clone of your fileserver, containing the bare minimum required to boot and start Samba in a configured state, and store it on each host's local disk, ready for you to attach the data drive from the failed VM and power it on.
This may or may not buy you extra options in the case of a SAN failure; best case, your data storage will require an fsck or other repair, but at least you don't have to fix, rebuild, or reconfigure the VM on top. Worst case, you've lost the data and need to go back to tape... but you were already in that state anyway.
I'd stick with the VMDK images, just in case you move to using vMotion in the future; you never know, you may get the budget for it.
If your machines aren't clustered, then as far as I'm concerned the best way to manage them is to spread the load as evenly as you can. I have three non-clustered 2950s where the load from the most critical VMs is split, as much as possible, a third on each. The theory is that I'm unlikely to lose more than one box at once, so at least two-thirds will be able to continue operating unaffected.
From a power point of view, it would probably be more efficient to load the machines up as near to 100% as you can and keep the other machines powered off, but that seems like putting all your eggs in one basket to me.
I wouldn't call myself an expert at this; it's just what I do.
Hey Matt. There are lots of ways to slice this up when you use virtualization. First off, there have been lots of benchmarks comparing raw LUN (RDM) versus VMDK performance, and the difference is typically shown to be negligible. Some things to be aware of with RDMs: only certain clustering situations (e.g. Microsoft clustering) require RDMs; RDMs have a 2 TB limit, though LVM can be used to work around it; and RDMs are "harder" to keep track of than giving a LUN to ESXi for VMFS and putting VMDKs on it. VMDKs (as mentioned) have some nice benefits: Storage vMotion and snapshots (you can't snapshot a physical-mode RDM).
If running the free ESXi, here is how I might go about your situation. First off, all data lives in VMDK files on VMFS LUNs. Set up two VMs and use Heartbeat for failover of the IP and services: Heartbeat will shift the service IP over, and can handle the scripting to unmount/mount the data disk where appropriate. You could even script some VMware Remote CLI to ensure the "down" VM gets powered off, for fencing. With Heartbeat directly coordinating between the systems, the risk of both accessing the data disk or running the same services should be extremely low. The key here is making sure that mounting/unmounting the data disk and starting/stopping the services are handled by Heartbeat, not the normal init mechanisms.
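As a sketch (assuming an old-style haresources setup; the device path, mount point, and service names are all placeholders), the resource script Heartbeat calls with start/stop might look like:

    #!/usr/bin/env python
    # Hypothetical Heartbeat resource script, e.g. /etc/ha.d/resource.d/fsdata.
    # Heartbeat runs it with "start" on the node taking over the resources and
    # "stop" on the node giving them up.
    import subprocess
    import sys

    DATA_DEVICE = "/dev/sdb1"   # the shared data disk inside the VM (assumption)
    MOUNT_POINT = "/srv/data"
    SERVICES = ["smb", "nfs"]   # init scripts Heartbeat manages, not init itself

    def run(cmd):
        return subprocess.call(cmd)

    def start():
        # Refuse to start the services if we couldn't get the data disk.
        if run(["mount", DATA_DEVICE, MOUNT_POINT]) != 0:
            sys.exit(1)
        for svc in SERVICES:
            run(["/etc/init.d/" + svc, "start"])

    def stop():
        for svc in reversed(SERVICES):
            run(["/etc/init.d/" + svc, "stop"])
        run(["umount", MOUNT_POINT])

    action = sys.argv[1] if len(sys.argv) > 1 else ""
    if action == "start":
        start()
    elif action == "stop":
        stop()
    else:
        sys.exit("usage: fsdata {start|stop}")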
An alternative failover could be driven by your monitoring system: when it detects the down host, it could use the VMware Remote CLI to issue a power-off of the dead VM (to be safe) and then a power-on of the backup VM. In this situation, failing back is fairly manual.
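A minimal sketch of what that event handler might look like (the host, credentials, and .vmx paths are made up, and the vmware-cmd options should be double-checked against your Remote CLI version):

    #!/usr/bin/env python
    # Hypothetical monitoring event handler: hard power-off the dead fileserver VM
    # (fencing), then power on the standby, via the Remote CLI's vmware-cmd.
    import subprocess

    ESX_HOST = "esx1.example.com"
    USER, PASSWORD = "root", "secret"
    DEAD_VMX = "/vmfs/volumes/san1/fileserver-a/fileserver-a.vmx"
    STANDBY_VMX = "/vmfs/volumes/san1/fileserver-b/fileserver-b.vmx"

    def vmware_cmd(vmx, *args):
        return subprocess.call(["vmware-cmd", "-H", ESX_HOST, "-U", USER,
                                "-P", PASSWORD, vmx] + list(args))

    # Power off the presumed-dead VM first so it can't touch the data disk...
    vmware_cmd(DEAD_VMX, "stop", "hard")
    # ...then boot the standby.
    vmware_cmd(STANDBY_VMX, "start")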
In my "tiny" environment I've not seen a VMDK get corrupted. What I've also come to realize is that if you have more than 2 ESX(i) hosts or a dozen VM's, you'll want to get vCenter to help keep track of everything. Some of the Essential/Plus packages are not too costly considering the benefits.
Matt, you know I don't use VMware, but I have always used "raw" with Xen. With just a few lightly loaded VMs, I doubt you'll see much of a performance difference. But as you start getting into more and more guests, if all those guests are on the same filesystem you will end up with queue-depth issues. This is especially true of NFS-backed storage; it's not so much that the NFS server has the issues, but most NFS client implementations suck.
I don't know of a good way to synchronize the VMDKs if you're looking for redundancy against a SAN failure. But if you use block devices, you still have the possibility of using DRBD to replicate just the VMs you want/need replicated.
I think you should ask yourself, "Do I ever plan to go back to physical servers?"
If the answer is maybe, then maybe you should stick with RDM. ESXi with RDM would (I think) require you to purchase something for your fibre to work (again, not 100% sure on ESXi).
We had several machines that I just quickly moved from physical servers into ESX (4.0) using RDM. I had a mix of Linux and Windows machines (super easy for both platforms). We still have a few running legacy FreeBSD (6.0 and older) on physical servers that we can't use RDM for, because the old FreeBSD kernel doesn't support it. It was quick and required me to do nothing other than point the VM at my LUN and then install VMware Tools. Brain-dead easy: no converter, no fuss...
Another thing you should ask yourself is, "What features of VMware do I want to use?"
Depending on your answer to that, you may have no choice other than VMDK. If you use your SAN for snapshots and don't care about using VMware for that, for example, then an RDM is still fine.
Some notes I'll share with you about what we've run into thus far. vMotion works equally well with RDM and VMDK; Storage vMotion, on the other hand, only works correctly with non-RDM disks, and trying to use Storage vMotion to go from RDM to VMDK sucks, so just use Converter. Most Linux distros have an open-source VMware Tools package, which makes installing Tools a non-issue. The backup appliance works really well and is free from VMware, but doesn't do as much as we would like it to.

I highly recommend taking a class from VMware; the one I took was a week long and worth every penny. VMware support is awesome: if you get a support contract and have to call, they are top-notch. I get frustrated getting to someone who can help me (too many menus...), but once I do, they ALWAYS come through with fast, reliable support.