I'm looking for a higher-performance build for our 1RU Dell R320 servers, in terms of IOPS.
Right now I'm fairly settled on:
- 4 x 600 GB 3.5" 15K RPM SAS
- RAID 1+0 array
This should give good performance, but if possible I'd also like to add an SSD cache into the mix. I'm just not sure there's enough room.
According to the tech specs, there are only 4 x 3.5" drive bays available in total.
Is there any way to fit at least a single SSD alongside the 4 x 3.5" drives? I was hoping there's a special spot to put the cache SSD (though from memory, I doubt there'd be room). Or am I right in thinking that cache drives are simply plugged in "normally" like any other drive, but nominated as CacheCade drives in the PERC controller?
Are there any options for having the 4x600GB RAID 10 array, and the SSD cache drive, too?
Based on the tech specs (which list up to 8 x 2.5" drives), maybe I need to use 2.5" SAS drives instead, leaving another 4 bays spare: plenty of room for the SSD cache drive.
Has anyone achieved this using 3.5" drives, somehow?
Edit: Further Info on Requirements
To give a picture of the uses/requirements of this hardware:
We currently have several "internal" VMs running on a VMware ESXi 5.x host. We only have a handful of hosts at the moment; it's a basic setup.
We recently started rolling out Dell R320s as our standard hardware for shared ESXi hosts. I'd prefer to stick with the R320s to keep our hardware (and hence our need for spares, upgrades, monitoring support, etc.) as standardised as possible. Having to keep a different set of drives as spares is better than having to keep entire spare chassis on top of what we already have.
These VMs are primarily responsible for either our internal tools (such as monitoring, call accounting, and intranet websites) or shared infrastructure, such as DNS, SMTP relay/filtering, a small set of shared websites, and a shared VoIP PBX.
Each of these roles is separated out into a relatively small VM as needed, almost all of which are Linux boxes. Some of these do have database loads, but I would consider them very small (enough that I have been OK with putting individual MySQL instances on each VM for isolation, where appropriate).
Generally, the performance is fine. The main catalyst for this new hardware is, ironically, the SMTP relay. Whenever we get hit by a decent-sized mail-out from a customer, it causes a major backlog in our filters. I've confirmed that this is due to disk IO.
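In case it's useful, this is roughly how I confirmed it: a minimal sketch that samples /proc/diskstats on the relay VM over a short window (iostat/sar report the same counters). The device name `sda` and the interval are just placeholders for my setup:

```python
#!/usr/bin/env python3
"""Rough IOPS sampler for a Linux guest, based on /proc/diskstats."""
import time

DEVICE = "sda"   # placeholder device name; adjust for your VM
INTERVAL = 5     # sampling window in seconds


def read_counters(device):
    """Return (reads_completed, writes_completed) for the given device."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[3]), int(fields[7])
    raise ValueError(f"device {device!r} not found in /proc/diskstats")


r1, w1 = read_counters(DEVICE)
time.sleep(INTERVAL)
r2, w2 = read_counters(DEVICE)

print(f"read IOPS:  {(r2 - r1) / INTERVAL:.1f}")
print(f"write IOPS: {(w2 - w1) / INTERVAL:.1f}")
```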
Given the nature of the other VMs running on this host, there's obviously disk IO contention, but no real impact is noticed aside from the mail backlog: VoIP is primarily in memory, the internal sites are so low in traffic that page loads are still reasonable, and we've had no reports of issues on the customer-facing VMs on this particular host.
The Goals of this hardware
Shamefully, I really don't have any solid numbers in terms of the kind of IOPS I want to achieve. I feel like it would be difficult to test, given the varying nature of the VMs I want to put on there; it's not as if I have a single application I can benchmark against for a guaranteed target.
I suppose my best bet would be to set up a test with the worst offenders (e.g. the DB-backed websites and the SMTP relays) and simulate some high load. This may be something I do in the coming week.
Frankly, my motivation is simply that I know disk IO is pretty much always a bottleneck, so I'd prefer that our infrastructure has as much IO as we can reasonably afford.
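If I do go down that road, something along these lines is what I had in mind as a crude first pass for the mail side (a sketch only: the queue path and message counts are made up, and a proper test would drive the actual relay with a mail generator rather than faking the queue writes):

```python
#!/usr/bin/env python3
"""Crude stand-in for an SMTP burst: write lots of small files with fsync,
similar to an MTA committing messages to its queue directory."""
import os
import random
import time

QUEUE_DIR = "/tmp/fake-mail-queue"    # hypothetical path
MESSAGES = 5000                       # made-up burst size
MIN_SIZE, MAX_SIZE = 2_000, 40_000    # bytes, roughly "small email" sized

os.makedirs(QUEUE_DIR, exist_ok=True)
start = time.time()

for i in range(MESSAGES):
    body = os.urandom(random.randint(MIN_SIZE, MAX_SIZE))
    path = os.path.join(QUEUE_DIR, f"msg-{i:06d}")
    with open(path, "wb") as f:
        f.write(body)
        f.flush()
        os.fsync(f.fileno())          # MTAs fsync each queued message

elapsed = time.time() - start
print(f"{MESSAGES} messages in {elapsed:.1f}s "
      f"({MESSAGES / elapsed:.0f} msgs/sec, fsync-bound)")
```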
I'll try to give you a rough idea of the goals in any case:
To reasonably survive the performance hits during a large customer-initiated mail-out (which they're not supposed to do!). Of course, I understand this becomes "how long is a piece of string?", as you can't really predict how large any given mail-out might be. But basically, I know it's a disk IO issue, so I'm trying to throw some extra IOPS at this particular set of hosts to handle a sudden burst of mail.
My thinking is that a large burst of small emails would typically be mostly random IO, which would seem best suited to SSDs. Though I'm sure we could do fine in the foreseeable future without them.
As stated previously, the above is really the catalyst for this. I realise I could put the SMTP relays onto their own physical hardware and basically be done with it, but I'm aiming for a more general solution in order to allow all of our internal VMs to have IO available to them if it's needed.
To isolate a set of internal VMs from some customer-facing VMs which are currently on the same host, to avoid performance issues from resource spikes on those customer VMs.
My plan is to have at least two hosts (for now) with the same VMs, and configure active/passive redundancy for each pair (we won't be using vCenter, but rather application-specific failover).
I will potentially be deploying more VMs onto this host in future. One thing I am looking towards is a pair of shared MySQL and MS SQL VMs. If I were to do this, I'd definitely be looking at SSDs, so that we can have a central pair of DB servers that are redundant and high-performance. But this is further down the road, and I'd likely have dedicated hardware for each node in that case.
The Dell PowerEdge R320 is a lower-end 1U rackmount server. Your storage options within that chassis are either 8 x 2.5" small-form-factor disks or 4 x 3.5" large-form-factor drives. Due to the price point of this server, it's commonly spec'd in the 4 x 3.5" disk configuration.
Sidenote: one of the things that's happened in server storage recently is the reversal of internal disk roles.
So the combinations above influence server design (or vice versa). The Dell R320 is usually configured with the larger 3.5" drives, since the platform isn't typically used for I/O-intensive applications or where more expandability is required. Higher-end Dells (and HP ProLiants) are typically configured with small-form-factor 2.5" disks, to support more I/O capability within the chassis.
For your situation:
For CacheCade and your IOPS goals: have you measured your existing IOPS needs? Do you understand your applications' I/O requirements and read/write patterns? Ideally, your design should be able to support the non-cached workload on the spinning disks alone. So if you need 6 disks to get the IOPS you need, you should spec 6 disks; if 4 disks can support it, then you're fine. But since this approach will require using the 2.5" disks anyway, you have more flexibility to tune for the application.
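As a sanity check for the "how many spindles" question, the usual back-of-envelope math looks something like this. The per-disk IOPS figures and the example workload are rule-of-thumb assumptions, not measurements from your environment; plug in numbers from your own monitoring:

```python
#!/usr/bin/env python3
"""Back-of-envelope spindle sizing from a target IOPS figure."""
import math

# RAID write penalties (writes per logical write) and rough per-disk IOPS
RAID_WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}
DISK_IOPS = {"15k_sas": 180, "10k_sas": 140, "7.2k_sata": 80}  # rules of thumb


def disks_needed(target_iops, read_ratio, raid="raid10", disk="15k_sas"):
    """Return the number of spindles needed to serve the workload uncached."""
    penalty = RAID_WRITE_PENALTY[raid]
    backend_iops = (target_iops * read_ratio
                    + target_iops * (1 - read_ratio) * penalty)
    return math.ceil(backend_iops / DISK_IOPS[disk])


# Hypothetical example: 600 IOPS at 60% reads on RAID 10 with 15K SAS
print(disks_needed(600, 0.6, "raid10", "15k_sas"))   # -> 5 spindles
```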
Also see: How effective is LSI CacheCade SSD storage tiering?
I don't believe there's space for what you want if you wish to use 3.5" disks. Have you considered using 2.5" disks instead? If you did, you could easily fit what you want into the machine, and generally a 2.5" 10k RPM disk will perform roughly on par with a 3.5" 15k RPM disk, especially when fronted by a nice bit of SSD cache as you intend. I use this solution on HP kit (including their SmartCache, which is the same thing) and I'm very happy with it.
Other things I'd do:
Use six disks, so there are always three of them in each RAID 1 set.
This has the advantage that you do not immediately lose redundancy when one disk fails (with a two-way mirror you would be betting that the surviving copy has no bad sectors), and, if your controller supports it, you can run the periodic consistency check on two disks and keep the third in regular operation so the I/O rate doesn't drop too significantly.
If it's a database load, use a dedicated disk or SSD for the indexes.
This is by far the greatest speed boost you can have -- provided you have a DBA who knows how to use this.
As stated before, a proper SSD cache setup means using the 8 x 2.5" disk backplane. Your post implies that the Dell R320 server(s) are already in use. If so:
Are you ready to upgrade the R320 backplane? Is an SSD cache really needed here?
An SSD cache is useful mainly for random read/write performance.
With RAID 10 on 15K SAS and the NVRAM write-back cache provided by a hardware controller, you already have good random write performance. Even if CacheCade can make full use of the SSD's IOPS, what about protecting your data against an SSD failure?
For random reads, you could consider just a memory upgrade (the R320 supports up to 192 GB).
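As a rough illustration (rule-of-thumb figures, not measurements): a single 15K SAS spindle is typically good for around 175-200 random IOPS, so a 4-disk RAID 10 delivers on the order of 700-800 random read IOPS and, with RAID 10's write penalty of 2, roughly 350-400 sustained random write IOPS, before the controller's write-back cache absorbs any bursts.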
Or another solution: just use RAID10 on SSDs.