My organisation is going to invest in either Red Hat virtualization with KVM or VMware ESX. Some of our work involves disk-intensive bioinformatics workloads such as BLAST, which we have found to be bottlenecked mostly by the disk rather than the CPU.
I know little about network-attached storage, SANs, iSCSI, etc., but would appreciate some pointers on which storage technology is worth focusing on. It seems to me that the fastest solution would be locally attached solid-state storage, but then we lose the advantages of VM migration for failover and load balancing.
What storage solution, for about $6-8k, should we be considering?
Direct attached storage will be much, much faster than any low-end SAN or NAS storage. Unless you have very stringent DR requirements (and from your storage budget I can surmise that you probably don't, even if your business stakeholders think they do), a hot failover strategy isn't worth the effort. You will get far more unscheduled downtime from configuration issues than you are ever likely to get from hardware failure with modern server equipment. Keeping your system simple will therefore probably reduce your real downtime.
My answer: keep it simple. Use a bare-metal server of appropriate capacity, with VMs for secondary apps if you really need them. Linux is much better than Windows at juggling multiple workloads on a single server. You can chroot apps if you need to.
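If you do go the chroot route, here's a minimal sketch of the idea in Python rather than shell (os.chroot needs root, and /srv/blast-jail is just a hypothetical jail directory that you'd populate with the app's binaries, libraries and data beforehand):

```python
import os

# Hypothetical jail directory that already contains the binaries,
# libraries and databases the app needs (e.g. a copy of BLAST).
JAIL = "/srv/blast-jail"

def run_in_jail(cmd):
    """Run a command confined to the chroot jail (must be run as root)."""
    pid = os.fork()
    if pid == 0:                      # child process
        os.chroot(JAIL)               # confine the filesystem view to the jail
        os.chdir("/")                 # start at the jail's root
        os.execvp(cmd[0], cmd)        # replace the child with the target app
    else:                             # parent: wait for the jailed app to finish
        _, status = os.waitpid(pid, 0)
        return status

if __name__ == "__main__":
    # e.g. run_in_jail(["/usr/bin/blastn", "-query", "query.fa", "-db", "nt"])
    run_in_jail(["/bin/ls", "/"])
```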
A modern server from a reputable manufacturer is pretty reliable, with redundant fans, disks and power supplies. They don't fail all that often. You're not going to get a credible DR facility for $4,000, and if that's your storage budget, the business doesn't actually consider this to be a mission-critical system.
Edit: $6,000-8,000 is still not going to get you a meaningful DR capability with performant disk. At this level of budget, direct-attached storage and KISS will get you by far the best bang for your buck.
I second what ConcernedofTunbridge said. Forget about a low-end SAN or NAS. The performance will most likely be worse than with regular DAS.
If you need fast disk access with DAS, go the SSD route. SSDs beat any hard disk drive by orders of magnitude, especially on random I/O.
EDIT: Since I have gotten a -1 for this, here are a few sources: a comparison of a 7.2k disk and the X-25E, and a comparison of SAS/SSD/SATA.
And quoting AnandTech:
You will have the choice between SLC (Single Level Cell) and MLC (Multi-Level Cell). SLC is much (at least 5 times) faster than MLC when it comes to writes. Depending on how many drives you put in each server, you can run them in RAID 10 to get even more performance.
A 32GB Intel X-25E (SLC) can be had for $375. A 160GB Intel X-25M (MLC) can be had for $405.
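To put those prices in perspective, here's a quick back-of-the-envelope $/GB comparison using the figures above:

```python
# Rough $/GB comparison using the prices quoted above.
drives = {
    "Intel X-25E 32GB (SLC)":  (32,  375),
    "Intel X-25M 160GB (MLC)": (160, 405),
}

for name, (capacity_gb, price_usd) in drives.items():
    print(f"{name}: ${price_usd / capacity_gb:.2f} per GB")

# Prints roughly $11.72/GB for the SLC drive vs $2.53/GB for the MLC drive,
# i.e. a ~4-5x per-GB premium for the faster SLC writes.
```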
Agree with ConcernedOfTunbridgeWells - direct attached storage is the way to go.
My knowledge of genetics is a bit rusty, but I would reckon that your problems arise from disk bandwidth rather than disk latency (these are different problems with different solutions). Storage volume is less of an issue these days, even with more complex indexing structures.
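If you want to confirm which of the two you're actually hitting, here's a crude Python sketch that times big sequential reads against small random reads on a test file. The path is a placeholder, and you'll want a file much larger than RAM (or to drop the page cache first) to get honest numbers:

```python
import os
import random
import time

TEST_FILE = "/data/testfile"   # placeholder: a large file on the volume under test
BLOCK_SEQ = 1024 * 1024        # 1 MiB blocks for the sequential (bandwidth) test
BLOCK_RND = 4 * 1024           # 4 KiB blocks for the random (latency/seek) test
RND_READS = 2000

def sequential_read_mb_per_s(path):
    """Read the whole file in big chunks and report MB/s."""
    start = time.time()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK_SEQ)
            if not chunk:
                break
            total += len(chunk)
    return (total / (1024 * 1024)) / (time.time() - start)

def random_read_iops(path):
    """Do small reads at random offsets and report reads per second."""
    size = os.path.getsize(path)
    start = time.time()
    with open(path, "rb") as f:
        for _ in range(RND_READS):
            f.seek(random.randrange(0, size - BLOCK_RND))
            f.read(BLOCK_RND)
    return RND_READS / (time.time() - start)

if __name__ == "__main__":
    print(f"sequential: {sequential_read_mb_per_s(TEST_FILE):.0f} MB/s")
    print(f"random 4K:  {random_read_iops(TEST_FILE):.0f} reads/s")
```

A dedicated tool like fio or bonnie++ will give you better numbers, but even this will tell you whether you're bandwidth-bound or seek-bound.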
Even so, $4k is not a huge budget. By the time you've bought a SCSI enclosure, power supply, adapter and cables, you'll have nothing left to buy disks with!
OTOH, the fact you are looking for a storage system implies that you've already got CPU and memory.
Starting from scratch, the most cost-effective way to achieve what you need would probably be a system with lots of mid-range SATA disks (probably in RAID 5), though it's hard to find cases/PSUs which will accommodate this easily. Certainly I'd recommend upgrading your case/PSU rather than using a separate enclosure connected via SCSI or the network.
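For budgeting, the usable space of the common RAID levels is easy to estimate; the disk count and size below are just illustrative figures, not a recommendation:

```python
def raid_usable_gb(level, n_disks, disk_gb):
    """Very rough usable-capacity estimate; ignores filesystem and metadata overhead."""
    if level == "raid5":
        return (n_disks - 1) * disk_gb        # one disk's worth lost to parity
    if level == "raid10":
        return (n_disks // 2) * disk_gb       # everything is mirrored
    if level == "raid0":
        return n_disks * disk_gb              # no redundancy at all
    raise ValueError(f"unknown level: {level}")

# Example: eight 1 TB SATA disks
for level in ("raid5", "raid10", "raid0"):
    print(level, raid_usable_gb(level, 8, 1000), "GB usable")
```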
But I would expect that if you did find that CPU was an issue, you'd get a lot more performance from running the comparison on a DSP/GPU.
HTH
C.
Either is worth considering. Virtualization has some overhead (a small percentage, depending on the technology), but that is it. All the rest is identical.
The rest really depends on your needs. That said, a $6-8k budget does not really get you anywhere near a higher-end solution. Forget SAN.
Your best bet budget-wise would be to look at SuperMicro - they have server chassis that take 24-72 disks. One of those, plus a nice high-end SAS controller (Adaptec 5805QZ), and you can plug in a LOT of disks in a nice RAID 10 configuration. I have one of those (a 24-disk case), currently running 12 disks in 2 RAID groups (one for virtualization, one for a virtualized DB server as a pass-through disk).
This is pretty much your best bet - at the end of the day, I/O depends on what the subsystem can deliver. On a budget, a SAN is out of the question. iSCSI can work, but the low-cost solutions are simply a lot less efficient.
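As a rough sanity check on what such a box can deliver, here's a back-of-the-envelope RAID 10 throughput estimate. The 120 MB/s per-disk figure is just an assumed ballpark for a modern SATA disk; substitute whatever your disks actually do:

```python
def raid10_throughput(n_disks, per_disk_mb_s):
    """Very rough RAID 10 throughput estimate, ignoring controller and bus limits."""
    reads  = n_disks * per_disk_mb_s          # reads can be spread across all spindles
    writes = (n_disks // 2) * per_disk_mb_s   # every write goes to both mirrors
    return reads, writes

# Example: 12 disks at an assumed ~120 MB/s sequential per disk
r, w = raid10_throughput(12, 120)
print(f"~{r} MB/s sequential read, ~{w} MB/s sequential write (best case)")
```

In practice the controller or the PCIe bus will usually cap you before the disks do, but it gives you an idea of why a box full of direct-attached spindles beats a cheap SAN.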