For a small business (an environment of fewer than 50 people) with 10-20 virtualizable servers and a very modest hardware/software budget: the servers would run a vanilla setup (LDAP/AD, DNS, NTP, web servers, SQL Server, file server, etc.). How would you achieve HA?
- Servers have very little CPU usage.
- Modest RAM requirements (1GB/server avg)
- Small Disk IO
- Budget in the range of $10 - $15k for hardware/software
- Total HDD Needs in the range of 4TB - 8TB
As others have mentioned above, it depends on your actual requirements for availability.
Option 1) I know what it means and I really need HA and Fault Tolerance.
Assuming you want a decent level of availability, I would budget at least $50,000 to $100,000 as a starting point.
Let's assume, for easy maths, that you have 20 users who need access to LDAP and file sharing. To do this you run up two physical machines attached to a SAN (required for most mid-range virtualisation technologies). The physical machines run VMware, with two virtual machines running Windows as AD replicas and file servers.
Costs so far (REALLY rough maths):
- Windows + CALs ~ $5,000
- VMware ~ $14,000
- SAN ~ $35,000
- Physical servers ~ $25,000 (32GB of RAM)
- Network infrastructure ~ $5,000
Let's call the total $80k and include working/run-up costs in there. Let's ignore power and connectivity costs for now.
This buys you a scenario where your VMware instances can fail over very quickly, and the internal mechanics of Windows clustering should give you fault tolerance. It also lets you scale out the VMs as required with future growth.
Assuming you now say VMware is too expensive and go with Xen, you can drop the VMware licensing fees. However, your biggest-ticket items, the SAN and the physical hardware, still exist, and you won't ever be able to avoid the Windows licensing. You will also have more administrative overhead and get to enjoy the wonderful learning curve that is Linux HA. I have not worked with Windows Hyper-V so won't quote figures there, but I expect the hardware costs are unlikely to be any lower.
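As a quick sanity check on the rough figures above, the line items can be tallied in a few lines of Python. This is only a sketch of the arithmetic: the numbers are the ballpark estimates quoted in this answer, not vendor quotes, and the names `ROUGH_COSTS`, `total_cost` and `share_of_total` are made up for illustration.

```python
# Ballpark line items quoted in the answer (USD). These are rough
# estimates, not vendor quotes.
ROUGH_COSTS = {
    "Windows + CALs": 5_000,
    "VMware": 14_000,
    "SAN": 35_000,
    "Physical servers": 25_000,
    "Network infrastructure": 5_000,
}


def total_cost(items):
    """Sum all line items."""
    return sum(items.values())


def share_of_total(items):
    """Return each line item's fraction of the total."""
    total = total_cost(items)
    return {name: cost / total for name, cost in items.items()}


if __name__ == "__main__":
    print(f"Total: ${total_cost(ROUGH_COSTS):,}")
    shares = share_of_total(ROUGH_COSTS)
    # Largest items first.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:<24} {share:6.1%}")
```

The items sum to $84,000, in the same ballpark as the rounded figure used here, and the SAN plus the physical servers make up about 71% of it, which is why swapping the hypervisor alone does not change the picture much.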
Option 2) I'd like to be able to recover quickly from a hardware failure
If by HA you mean "I'd like to be able to recover quickly from a hardware failure, but I can have decent maintenance time when I need it", a single virtual host with sufficient disk space will likely do. Take reliable backups often enough and you will be able to recover to another machine quickly. Buy a decent enough single VM host and your downtime should be minimal.
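How quick "quickly" is depends mostly on how much data has to be copied back. A minimal sketch of that estimate follows. It assumes the 4-8TB storage figure from the question and a sustained restore throughput of 100 MB/s, which is purely an assumption (roughly what a single gigabit link can carry at best); `restore_hours` is a hypothetical helper, and real restores are usually slower than raw throughput suggests.

```python
def restore_hours(data_tb, throughput_mb_s):
    """Hours needed to copy `data_tb` terabytes at a sustained
    throughput of `throughput_mb_s` megabytes per second.

    Uses decimal units (1 TB = 1,000,000 MB) and ignores any
    per-file or verification overhead.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = data_tb * 1_000_000
    return total_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    ASSUMED_THROUGHPUT_MB_S = 100  # assumption, not a measured value
    for size_tb in (4, 8):
        hours = restore_hours(size_tb, ASSUMED_THROUGHPUT_MB_S)
        print(f"{size_tb} TB at {ASSUMED_THROUGHPUT_MB_S} MB/s: {hours:.1f} h")
```

Under those assumptions a full restore takes roughly 11 to 22 hours, so restoring the most important guests first matters more than the total figure.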
This will allow you to dip your toes cheaply into virtualisation and learn quickly what you really want.
Option 3) I'm insane and wish to use older hardware I've hoarded over the last decade
Grab three machines. On one you run Linux with an iSCSI target (I've used ietd with success); on the other two you run VMware ESXi. Configure the ESXi hosts to use the Linux iSCSI target as their storage and you have a very cheap SAN. You can manually balance whatever machines you need between the ESXi hosts. If you run two Windows servers in clustered mode you can lose an ESXi host without too much consequence. A bonus is that you can add extra ESXi hosts at little cost, and moving guests around only requires a reboot.
If you want to get tricky, grab a fourth machine and run Linux DRBD to block-replicate the "SAN" for redundancy there.
Ultimately
For any kind of VM migration and most clustering/failover, however, you will need a SAN, and while 4-8TB isn't a lot in consumer or single-server storage, it is still significant in terms of reliable, quick SAN storage.
If you are serious about this I would actually just pick up a phone and ring your favourite vendor (HP/Dell/IBM et al), ask them what they can do for your budget and start from there. They have cookie cutter builds and are proficient at dropping clusters that will suit most shops easily enough.
**The really short answer**
For 50 vanilla users I simply wouldn't. I'd back a server up often and collect the overtime running my patches on a weekend. The complexity simply isn't worth it, and if it is, don't scrimp on costs.
Depends what you mean by HA - do you mean "I can't afford ANY downtime" or "If a VM or host dies I want it to automatically restart on another host, meaning it'll be out of action for the time it takes to restart"?
If the former, then I'd use either Windows Clustering or VMware vSphere's 'Fault Tolerance' mode; if the latter, then I'd just use VMware's regular HA service.
If your budget is modest, you should take a look at Red Hat's RHEV. It is easy to set up and maintain, and if you're no Linux admin, there's no need to go to the command line at all. http://www.redhat.com/virtualization/rhev/
That budget should be plenty, assuming you're willing to compromise on performance, and it's relatively easy to scale up performance later.
The typical building blocks would be VMware Essentials Plus at a list price of $3600 including 1 year of support, which provides VMware vSphere with HA (automatic restart of the OS on the second physical server in the event of failure), a pair of servers like the Dell PowerEdge R410 with Intel Xeon X5550 processors and 16GB RAM, and a Dell MD3000 storage array. You should be able to get all of them for around $10000, plug the servers and storage together using SAS (it's certified, reliable and pretty easy), and off you go.
You can swap out the VMware vSphere for XenServer, RHEV, etc, and remove some of the costs, but you'll still probably want a support contract, and you may need to buy more memory. VMware Essentials Plus also comes with a pretty decent backup solution included, which you'll need to think about.
If you want to upgrade later for more performance, you can add a second processor to each server, and more memory, and extra disks.
I would also recommend Red Hat Enterprise Virtualisation if you're on a budget, but I might be a bit biased as I am a Support Engineer for it. Please keep in mind this is my opinion and not that of Red Hat.
If you can get away with standard business hours support instead of 24x7 support, you can get RHEV for $499 per host socket. A cheap setup would be two servers with 8GB of RAM each and either one or two CPU sockets, depending on the CPU workload your servers generate. A quad-core CPU with hyperthreading works well for low-utilisation servers; however, if you're running multiple heavily utilised servers you might need a second CPU in each RHEV host. RHEV supports memory overcommit, and on typical server workloads you would overcommit to 1.5x the host memory. Additionally, upping both hosts to 12GB or 16GB of RAM should allow all guests, rather than just a core subset, to run on a single server in case of a host outage. Hosts can be booted from a USB stick, PXE or CD, although a small hard drive is often used simply for convenience.
The other costs are simply storage and the management node. RHEV 2.1 (the current version) requires a system running Windows 2003 R2 for management. This can be an older or lower-spec machine (for testing I have a 1GB VM on my desktop running a test RHEV cluster in a lab).
Lastly, there are storage and fencing (disabling a failed host for high availability). Both hosts should have iLO, DRAC or a similar management controller that RHEV supports for fencing. Most Dell, HP and IBM rack servers are supported in that regard, and the feature should be standard on everything except the lowest-end servers.
As for shared storage: Fibre Channel, iSCSI and NFS are all supported. Depending on your existing infrastructure, one of these might be cheaper than the others. NFS is often used for test RHEV clusters because an old PC can be used as the NFS server; however, disk IO will become a problem with a large number of guests, and reliability concerns may make this an issue as well.
Storage needs to be connected to both RHEV hosts but not to the management node, so a 2-port FC disk array is suitable. I have seen some people build their own high-performance iSCSI servers from components; however, I am assuming you would like something vendor supported. NetApp filers are a somewhat common option, as is any high-performance NAS device that reliably supports NFS. I believe storage will likely be the biggest cost in your project unless you have some existing infrastructure to utilise.
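The sizing argument above can be checked with a little arithmetic. The sketch below uses the figures from this thread ($499 per host socket, a 1.5x memory overcommit, and the question's 10-20 guests at about 1GB each). The function names are hypothetical, and the model ignores hypervisor overhead and per-guest peaks, so treat the result as a first approximation only.

```python
def license_cost(hosts, sockets_per_host, price_per_socket=499):
    """Subscription cost at a per-socket price (USD)."""
    return hosts * sockets_per_host * price_per_socket


def survives_host_failure(guest_count, avg_guest_ram_gb, host_ram_gb,
                          hosts=2, overcommit=1.5):
    """True if every guest still fits after losing one host.

    Capacity is modelled as physical RAM times the overcommit ratio
    on each surviving host; hypervisor overhead is ignored.
    """
    surviving_hosts = hosts - 1
    needed_gb = guest_count * avg_guest_ram_gb
    available_gb = surviving_hosts * host_ram_gb * overcommit
    return needed_gb <= available_gb


if __name__ == "__main__":
    print("Single-socket pair:", license_cost(2, 1))
    print("Dual-socket pair:  ", license_cost(2, 2))
    for guests in (10, 20):
        for ram_gb in (8, 12, 16):
            ok = survives_host_failure(guests, 1, ram_gb)
            print(f"{guests} guests, {ram_gb}GB hosts: all fit on one host = {ok}")
```

With these assumptions, 10 guests fit on a single 8GB host during an outage, while 20 guests at 1GB each need the 16GB hosts to keep everything running on one machine.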