How smart are RAID controllers when they're under load?
Given a moderate-to-high-end controller and cabinet (i.e. current standard Dell or HP off-the-shelf kit...), RAID 5, 10Gb fiber, at least 3/4 of the space utilized, and lots of small, non-sequential files with heavy mixed read/write access, such as would be found on a file or email server...
Practical implementation question: Given the same amount of space, what would be faster, a small number of large, high-speed drives (i.e. 4x 1TB 15,000 rpm) or a larger number of smaller, moderate- or low-speed drives (i.e. 9x 500GB 7,200 or 10,000 rpm)?
Theoretical question: Do RAID cabinets/controllers know where the head of a drive currently is and the location they need to seek to so that they can assign reads to the drive with the least head-travel distance? Or does it matter?
What other variables come into play with minimizing response time and maximizing throughput with a large number of non-sequential small files on a shared storage array? Note that cache doesn't come into play that heavily because of the nature of the data.
From your description your workload will be heavily random access, so the limiting factor is random I/O operations per second (IOPS). On a RAID-5 you will get (in practice) slightly less than one read or write per revolution of the disk per physical spindle. In this situation more physical disks and faster RPM mean more throughput.
Under a heavily random I/O situation where the working set of the data requests overflows the cache the throughput of the system is a function of the number of physical disks x the speed of the disk. The more disks the better, the faster the better.
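To put rough numbers on that, here is a back-of-the-envelope sketch in Python; the RPM and average seek figures are illustrative assumptions, not vendor specs:

```python
# Back-of-envelope random IOPS per spindle: one I/O per (average seek + half a rotation).
# The seek times below are illustrative assumptions, not vendor specs.

def spindle_iops(rpm, avg_seek_ms):
    """Approximate small random IOPS for a single disk."""
    rotational_latency_ms = (60_000 / rpm) / 2   # half a revolution on average
    return 1000 / (avg_seek_ms + rotational_latency_ms)

print(round(spindle_iops(15000, 3.5)))   # roughly 180 IOPS for a 15k rpm drive
print(round(spindle_iops(7200, 8.5)))    # roughly 80 IOPS for a 7.2k rpm drive
```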
On your theoretical question, the disks support a feature known as 'tagged command queuing'. This allows the controller to dispatch I/O requests to the disks and receive the results asynchronously. The internal board on the disk is aware of the position of the disk heads and can optimise the operations by completing them in the order it decides is optimal. The algorithm for this is some variant of 'elevator seeking'.
Results can be returned out of order, but are tagged with the request number so the RAID controller knows which request to match the reply to (hence the 'tagged'). SATA has a slightly different protocol known as 'native command queuing' that does something similar.
In this case the RAID controller does not have to be aware of the physical position of the disk heads as this is managed by the firmware on the disk itself.
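For illustration, a toy version of that elevator-style reordering might look like the sketch below; the LBAs and head position are invented, and real firmware is of course far more sophisticated:

```python
# Toy elevator-style reordering of a queue of tagged requests, roughly what a
# drive's firmware does internally. LBAs and head position are made up.

def elevator_order(pending_lbas, head_pos):
    """Service requests ahead of the head in ascending order, then the rest on the way back."""
    ahead = sorted(lba for lba in pending_lbas if lba >= head_pos)
    behind = sorted((lba for lba in pending_lbas if lba < head_pos), reverse=True)
    return ahead + behind

queue = [980, 120, 450, 300, 760]            # outstanding tagged requests
print(elevator_order(queue, head_pos=400))   # [450, 760, 980, 300, 120]
```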
On a heavily random access workload a single pair of FC loops will support quite a lot of disks. For a streaming workload such as video FC will become a bottleneck more quickly.
Some controllers can support impressively large cache sizes. You may want to try to estimate the working set size and see if you can actually upgrade your controller's cache to accommodate it. If you have a background in statistics you might be able to build a Monte Carlo model based on usage statistics gathered from the requests.
Another possibility to improve performance might be to use a layer of solid-state disks for fast storage, although this depends on whether your controller supports this configuration.
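If you do go down that route, a very rough Monte Carlo sketch might look like the following; the popularity distribution, cache size and request count are placeholders you would replace with statistics from your own request logs:

```python
# Very rough Monte Carlo sketch: estimate the cache hit rate for a given cache
# size, assuming a skewed (Zipf-like) block popularity. All parameters here
# are placeholders for numbers derived from real request logs.
import random
from collections import OrderedDict

random.seed(42)
N_BLOCKS   = 100_000   # distinct blocks in the working set (assumed)
CACHE_SIZE = 8_000     # blocks the controller cache can hold (assumed)
N_REQUESTS = 200_000   # simulated I/O requests

# Skewed popularity: low-numbered blocks are requested far more often.
weights = [1.0 / (rank + 1) for rank in range(N_BLOCKS)]

def simulated_hit_rate():
    cache, hits = OrderedDict(), 0
    for block in random.choices(range(N_BLOCKS), weights=weights, k=N_REQUESTS):
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # LRU: refresh on hit
        else:
            cache[block] = None
            if len(cache) > CACHE_SIZE:
                cache.popitem(last=False)   # evict the least recently used block
    return hits / N_REQUESTS

print(f"estimated cache hit rate: {simulated_hit_rate():.1%}")
```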
IOPS, IOPS and IOPS. Those are what you need to consider. You might get more IOPS from a faster spinning drive. Alternatively you might get more IOPS from a higher spindle count.
There is a good comparative article by Adaptec that answers almost exactly the same question.
If you have some drives in mind (i.e. from a specific vendor) then you can run the maths.
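As a rough illustration of "running the maths" for the two configurations in the question: the snippet below uses the usual seek-plus-rotation service-time approximation and a nominal RAID-5 small-write penalty of 4 (read data, read parity, write data, write parity). The seek times and the 50/50 read/write mix are assumptions.

```python
# Very rough comparison of the two configurations in the question.
# Seek times, read/write mix and the RAID-5 write penalty are assumptions.

def spindle_iops(rpm, avg_seek_ms):
    return 1000 / (avg_seek_ms + (60_000 / rpm) / 2)

def array_random_iops(n_disks, rpm, avg_seek_ms, read_fraction=0.5, write_penalty=4):
    raw = n_disks * spindle_iops(rpm, avg_seek_ms)
    # effective host IOPS once the RAID-5 write penalty is applied to the write share
    return raw / (read_fraction + (1 - read_fraction) * write_penalty)

print(round(array_random_iops(4, 15000, 3.5)))   # 4x 15k rpm drives
print(round(array_random_iops(9, 7200, 8.5)))    # 9x 7.2k rpm drives
```

With these particular assumptions the two configurations come out roughly even, which is why the real per-drive IOPS figures from the vendor matter so much.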
There are no 1TB 15krpm disks currently so this is theoretical at best, but we have similar usage scenarios, and for a mid-level system such as you're looking at I'd strongly recommend the use of 2.5" SFF SAS disks, ideally at 15krpm. A decent-sized array of these isn't very expensive and is seriously fast for sequential traffic and extremely fast for random too. Take a look at the HP MSA 70.
I've done lots of testing on Dell servers with Perc5/i and 6/i controllers. We resell Dells, and I can rarely resist the temptation to speed test them whenever a new server or new disk configuration passes through our doors. NB I test using a disk tester of my own devising. I make no special claims for my tester other than several years experience suggests it correlates well with the speed of the server when it's on site. See http://ratsauce.sourceforge.net/index.html#diskthrasher for the gory details.
Anyhow, my experience suggests that at least up to six disk RAID5, each extra disk produces a useful increase in speed. I've tested with 15K SAS disks and also with (Western Digital RE2 and RE3) SATA disks, and while the SATA disks are obviously a lot slower at random access, a six SATA disk RAID5 is at least as good as a four 15K SAS disk RAID5. So I would go for more cheaper disks.
Your question about head positioning on reads would only apply to RAID1 or RAID10, and I don't use these anymore so I can't comment. From my somewhat dim memory a RAID1 on a Perc 4e/Di is noticeably faster than a single disk, suggesting that there is some benefit. However the difference was small, maybe a 10-20% increase in speed, and this could just have been due to better caching on the RAID controller.
Incidentally, if you download my diskthrasher app I include lots of test results from various disk configs.
JR
In general the answer will depend on the system you are looking at. If you want the system to be fast I would usually look at the IOPS the drive is capable of and use that to decide. However I would have a tendency to prefer a larger number of drives over a smaller one.
There are a number of other things that can be done in Linux to increase performance.
Generally the heads in a RAID-5/6 system should be in sync. Make sure the array doesn't have to read an area before it can write to it by making your writes large enough to cover a full stripe.
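As a quick illustration of the full-stripe point, with an assumed chunk size and disk count (both would really come from your controller configuration):

```python
# Illustration of the full-stripe write point. Chunk size and disk count are assumptions.
N_DATA_DISKS = 5            # data disks per stripe, e.g. a 6-disk RAID-5
CHUNK_SIZE   = 64 * 1024    # per-disk chunk (stripe unit) in bytes
FULL_STRIPE  = N_DATA_DISKS * CHUNK_SIZE   # 320 KiB of data per stripe

def needs_read_modify_write(offset, length):
    """True if the write doesn't cover whole stripes, forcing old data/parity to be read first."""
    return offset % FULL_STRIPE != 0 or length % FULL_STRIPE != 0

print(needs_read_modify_write(0, FULL_STRIPE))    # False: full-stripe write, parity computed from new data
print(needs_read_modify_write(4096, 64 * 1024))   # True: partial write, read-modify-write needed
```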
I think there are just too many variables to be considered here to be able to make a general suggestion. Comparing the number of drives and their speeds isn't practical on its own, because other factors like the number of heads and platters will also come into play, so it depends on the drive model as well.
Ideally you'd pick a few different models that meet your application requirements and benchmark them against each other with your actual workload, then buy the one that fits the bill.
As long as you remember that most "big" SANs have ~50% overhead, the end result is usually fairly fast.
The big one is that NAS frontends (so far with the exception of NetApp) can't handle any serious metadata load, so be careful.
RAID controllers in theory could know the last expected head location and use it when re-ordering requests (this is, after all, a traditional part of elevator algorithms), but these days giant RAM caches are just easier.