BACKGROUND
We're considering building a virtual machine host to use for QA testing. Our primary goal is to be able to easily configure a set of virtual machines in a self-contained environment that simulates the main machines in our enterprise. We'll likely have a database machine, an app server, a web server, and one or two client machines within each environment.
We'd like to have between two and four environments active at any one time (i.e. up to twenty VMs simultaneously), with disk space for perhaps another four environments to be offline.
This is going to require a lot of horsepower just for the basics. We're not going to be testing performance in these environments; it will mostly be automated functional and integration testing, and possibly some manual testing performed by actual humans. The VMs don't need to behave like they have fast processors, but we would prefer that they not be bogged down by slow disk latency.
QUESTION
Given these goals, what do you think we should consider from a hardware standpoint? Is it worth splitting this over several 'smaller' machines rather than one honking great big one?
Memory will likely be your greatest restriction. Check whether your virtualization platform can share unused memory between guests (i.e. whether it lets you allocate more than 100% of the physical memory).
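If you end up on a libvirt-based hypervisor such as KVM, a quick sketch like this (the connection URI and the overcommit policy are assumptions on my part, not something specific to your setup) will show how much memory you've promised to guests versus what the host physically has:

    import libvirt  # assumes the libvirt-python bindings are installed

    # Read-only connection to the local hypervisor (URI is an assumption; adjust for your setup).
    conn = libvirt.openReadOnly('qemu:///system')

    host_mem_mb = conn.getInfo()[1]  # getInfo() returns [model, memory_MB, cpus, ...]

    # Sum the maximum memory configured for every defined guest (dom.info()[1] is in KiB).
    allocated_mb = sum(dom.info()[1] // 1024 for dom in conn.listAllDomains())

    print(f"Host memory:       {host_mem_mb} MB")
    print(f"Guest allocations: {allocated_mb} MB")
    if allocated_mb > host_mem_mb:
        print("You are overcommitting memory; check how your platform handles that.")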
We have a hefty 8-core, 16 GB RAM host, and it runs about 20 VMs. I would think it would be more cost-effective, versatile and redundant to have two hosts of half this size. With only 20 virtual machines, however, anything beyond about four hosts would probably become difficult to manage.
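As a rough back-of-the-envelope check (the per-VM and reserve figures below are illustrative assumptions, not measurements from our setup), you can sanity-check how many hosts a given VM count needs:

    # Back-of-the-envelope capacity check. All numbers are illustrative assumptions.
    VM_RAM_MB = 768          # assumed average allocation per QA guest
    HOST_RAM_MB = 16 * 1024  # 16 GB host, as in the example above
    HOST_RESERVE_MB = 2048   # assumed headroom for the hypervisor itself

    vms_per_host = (HOST_RAM_MB - HOST_RESERVE_MB) // VM_RAM_MB
    print(f"VMs per host: {vms_per_host}")          # ~18 with these numbers

    target_vms = 20
    hosts_needed = -(-target_vms // vms_per_host)   # ceiling division
    print(f"Hosts needed for {target_vms} VMs: {hosts_needed}")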
If you want even more versatility, locate the VMs on a SAN or other shared storage so that they can be run on either/any host.
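With the images on shared storage, moving a running VM between hosts is just a state transfer. Here's a minimal sketch using the libvirt Python bindings (the host names, VM name and connection URIs are hypothetical):

    import libvirt  # assumes libvirt-python and live-migration support on both hosts

    # Hypothetical connection URIs; both hosts must see the same shared storage.
    src = libvirt.open('qemu+ssh://qa-host-a/system')
    dst = libvirt.open('qemu+ssh://qa-host-b/system')

    dom = src.lookupByName('qa-env1-appserver')  # hypothetical VM name
    # Live-migrate the guest; the disk image stays put on the SAN/NFS share.
    dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)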
Hard disk latency is going to be an issue. Each virtual server has a different access pattern, and contention for a shared disk both slows things down generally and can hide synchronization errors, because the disk forces all requests into a single order (based on the time each request was received).
I would go for a striped, mirrored array (RAID 10) of very fast drives to limit your exposure to this problem. It will still hide problems (notably certain race conditions), but not as much.
Load the server up with memory (16-32 GB is not overdoing it), and go for an 8-core machine, or two 4-core machines with 8-16 GB each.
-Adam
Have you considered using a blade chassis for this? All of our VMware systems at my company run in blade chassis, which gives us a lot of flexibility and redundancy from a hardware standpoint. You can even have "hot spare" blades, as well as the ability to add additional blades or swap in more powerful blades at your leisure.
VMware specifically even has built-in support for some advanced features of HP-branded blade chassis.
The virtualization mechanisms that I've worked with required static allocation of memory for the virtual hosts, so you'll probably need pretty hefty RAM in the servers.
Regarding fewer big honking machines vs. more not-so-honking machines: it's worth keeping in mind that, in a virtualization context, the physical host is effectively a single point of failure for all of its virtual hosts.