My company is trying to figure out what type of SAN to purchase. This is specifically for database servers that are becoming IO constrained (storage is DAS right now, but we're hitting the limit of a single server and we'd like to add clustering as well).
We need a solution that will produce around 3000 IOPS long-term (we currently peak around 1000 IOPS). Most of our database operations are small reads/writes. Based on discussions with HP engineers and others online, an HP P2000 with 24 SAS HDs in a RAID 10 configuration will deliver just short of that speed for ~$20K. Adding in controllers and other items to build out the SAN puts us right around our max budget of $30K.
But online, I see that many SAS SSDs deliver speeds of 80,000+ IOPS. Is this realistic to expect? If so, would it be realistic to get a P2000 or similar entry-level SAN and throw a few SSDs in there? Our databases are small, only a couple of TB total. If we did this, we'd have money left over to buy a second SAN for mirroring/failover, which seems prudent.
The rule of thumb I use for disk IO is:
75 IOPs per spindle for SATA.
150 IOPs per spindle for FC/SAS.
1500 IOPs per spindle for SSD.
As well as IOPs per array, also consider IOPs per terabyte. It's not uncommon to end up with a very poor IOPs-per-TB ratio when doing SATA + RAID 6. That might not sound like much of a problem, but you will often end up with someone spotting 'free space' on an array and wanting to use it. It's common for people to buy gigs and ignore IOPs, when really the opposite is what matters in most enterprise systems.
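To make that concrete, here's a rough back-of-envelope sketch in Python using the per-spindle figures above; the drive sizes and usable fractions in the example are illustrative assumptions, not numbers from the question:

```python
# Back-of-envelope: aggregate read IOPS and IOPS per usable TB for a shelf of disks.
# Per-spindle figures are the rules of thumb above; drive sizes/usable fractions are assumed.

PER_SPINDLE_IOPS = {"sata": 75, "sas": 150, "ssd": 1500}

def raw_iops(disk_type: str, spindles: int) -> int:
    """Aggregate read IOPS before any RAID write penalty."""
    return PER_SPINDLE_IOPS[disk_type] * spindles

def iops_per_usable_tb(disk_type: str, spindles: int,
                       tb_per_disk: float, usable_fraction: float) -> float:
    """IOPS per usable terabyte -- often the ratio that matters more than raw GB."""
    return raw_iops(disk_type, spindles) / (spindles * tb_per_disk * usable_fraction)

# 24 x 600GB SAS in RAID 10 (50% usable) vs 12 x 4TB SATA in RAID 6 (80% usable):
print(raw_iops("sas", 24), iops_per_usable_tb("sas", 24, 0.6, 0.5))    # 3600, 500 IOPS/TB
print(raw_iops("sata", 12), iops_per_usable_tb("sata", 12, 4.0, 0.8))  # 900, ~23 IOPS/TB
```

The SATA + RAID 6 shelf looks attractive on capacity, but its IOPS-per-TB ratio is more than an order of magnitude worse - which is exactly the 'free space' trap described above.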
Then add in the cost of the RAID write penalty:
Write penalty can be partially mitigated by nice big write caches, in the right circumstances. If you've got lots of sequential write IO (like DB logs) you can reduce those write penalties on RAID 5 and 6 quite significantly. If you can write a full stripe (e.g. one block per spindle) you don't have to read to compute parity.
Assume an 8+2 RAID 6 set. In normal operation, for a single write IO you need to: read the existing data block, read both existing parity blocks, then write the new data block and both new parity blocks. That's 6 back-end IOs per incoming write, i.e. a write penalty of 6.
With a cached full-stripe write - e.g. 8 consecutive 'chunks' adding up to the size of the RAID stripe - you can calculate parity across the whole lot without needing a read. So you only need 10 writes: one to each data disk, and two parity.
This makes your write penalty 1.25 (10 back-end writes for 8 incoming writes).
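A minimal worked version of that arithmetic, assuming the same 8+2 geometry as above:

```python
# RAID 6 write cost for the 8+2 example above (the geometry is the assumption from the text).

data_disks, parity_disks = 8, 2

# Small random write: read old data + both old parity blocks,
# then write new data + both new parity blocks.
single_write_ios = 3 + 3                       # 6 back-end IOs -> write penalty of 6

# Cached full-stripe write: write every data chunk plus both parity chunks, no reads needed.
full_stripe_ios = data_disks + parity_disks    # 10 back-end writes
full_stripe_penalty = full_stripe_ios / data_disks   # 10 / 8 = 1.25

print(single_write_ios, full_stripe_penalty)
```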
You also need to bear in mind that write IO is easy to cache - you don't need to get it on disk immediately. It operates under a soft time constraint - as long as on average your incoming writes don't exceed spindle speed, it'll all be able to run at 'cache speed'.
Read IO on the other hand, suffers a hard time constraint - you cannot complete a read until the data has been fetched. Read caching and cache loading algorithms become important at that point - predictable read patterns (e.g. sequential, as you'd get from backup) can be predicted and prefetched, but random read patterns can't.
For databases, I'd generally suggest you assume that:
most of your 'database' IO is random read (e.g. bad for caching). If you can afford the overhead, RAID 1+0 is good, because mirrored disks give you two sources for reads.
most of your 'log' IO is sequential write (e.g. good for caching, and contrary to what many DBAs will suggest, you probably want RAID 50 rather than RAID 10).
The ratio of the two is hard to say; it depends what the DB does.
Because random read IO is a worst case for caching, it's where SSD really comes into its own - a lot of manufacturers don't bother caching SSD because it's about the same speed anyway. So especially for things like temp databases and indexes, SSD gives a good return on investment.
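To turn those rules of thumb into a spindle count, here's a minimal sizing sketch. The 3000 IOPS target comes from the question, but the 70/30 read/write split is purely an assumed example, and the penalties are the usual textbook per-write figures for each RAID level:

```python
# Rough spindle sizing: reads pass straight through, writes are amplified by the RAID penalty.
# The 70/30 read/write split below is an assumption for illustration, not a measured number.
import math

RAID_WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def backend_iops(frontend_iops: int, read_fraction: float, raid: str) -> float:
    """Back-end IOPS the spindles must absorb for a given front-end load."""
    reads = frontend_iops * read_fraction
    writes = frontend_iops * (1 - read_fraction)
    return reads + writes * RAID_WRITE_PENALTY[raid]

def spindles_needed(frontend_iops: int, read_fraction: float,
                    raid: str, per_spindle_iops: int) -> int:
    return math.ceil(backend_iops(frontend_iops, read_fraction, raid) / per_spindle_iops)

# 3000 front-end IOPS, 70% random read, on 150-IOPS SAS spindles:
print(spindles_needed(3000, 0.7, "raid10", 150))  # 26 spindles
print(spindles_needed(3000, 0.7, "raid6", 150))   # 50 spindles
```

The gap between those two numbers is the write penalty at work, and it's why the read/write ratio matters so much when choosing a RAID level.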
I can speak on the specifics of what you're trying to accomplish. Honestly, I would not consider an entry-level HP P2000/MSA2000 for your purpose.
These devices have many limitations and, from a SAN feature-set perspective, are nothing more than a box of disks: no tiering, no intelligent caching, a maximum of 16 disks in a Virtual Disk group, low IOPS capability, and poor SSD support (especially on the unit you selected).
You would need to step up to the HP MSA2040 to see any performance benefit or official support with SSDs. Plus, do you really want to use iSCSI?
DAS may be your best option if you can tolerate local storage. PCIe flash storage will come in under your budget, but capacity will need to be planned carefully.
Can you elaborate on the specifications of your actual servers? Make/model, etc.
If clustering is a must-have, another option is the HP MSA2040, but with SAS connectivity instead of iSCSI. This is less costly than the other models, allows you to connect 4-8 servers, offers low latency and great throughput, and can still support SSDs. Even the Fibre Channel or iSCSI models of this unit would give you more flexibility than the one you linked.
Your analysis is pretty correct.
Use a few HDDs for lots of GBs, and lots of HDDs for a few IOPs.
Use a few SSDs for lots of IOPs, and lots of SSDs for a few GBs.
Which is more important for you? Space is the big cost-driver for SSD solutions, since the price-per-GB is much higher. If you're talking about a 200GB database needing 4K IOPs, a pair of SSDs will get you there. So will a 24-disk array of 15K drives, which also leaves you lots of space for bulk storage.
How many IOPs you'll actually get out of those SSDs varies a lot based on the storage infrastructure (ewwhite will elaborate on that), but it's reasonable to get those kinds of speeds - especially with RAID 10, where parity isn't being computed.
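As a quick sanity check of that 200GB / 4K IOPS example, using the conservative per-spindle rules of thumb from the earlier answer (assumed figures, not vendor specs):

```python
# Read IOPS comparison using the earlier rules of thumb (1500 per SSD, 150 per 15K SAS drive).
ssd_pair_read_iops = 2 * 1500   # mirrored SSD pair: both sides can serve reads
sas_24_read_iops   = 24 * 150   # 24 x 15K SAS spindles in RAID 10

print(ssd_pair_read_iops, sas_24_read_iops)   # 3000 vs 3600 -- real SSDs typically rate far higher
```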
I recently built a pair of storage servers for my employer, using Dell C2100 chassis running FreeBSD 10.1 with twelve 2TB 7200rpm Western Digital "SE" enterprise SATA drives. The drives are in a single ZFS pool consisting of two 6-drive RAIDZ-2 virtual devices (vdevs). Attached to the pool is a pair of Intel DC S3500 SSDs, which are supercap-protected against power loss; they are used as both SLOG and L2ARC. Load testing this server over iSCSI, I was able to hit 7500-8200 IOPS. Our total cost including hard drives was about $2700 per server.
In the time that these BSD-based systems have been running, one of our HP MSA2012i SAN units has experienced two controller faults, and our other MSA2012i unit corrupted a large NTFS volume requiring 12 hours of downtime to repair.
Dell and HP sell you 10% hardware and 90% promise of support that you never end up being able to utilize.