I have a CentOS box with 10 1TB drives & an LSI RAID controller, used as an NFS server.
I know I'm going to use RAID 1 to create 5TB of usable space. But in terms of performance, reliability & management, which is better: creating a single 5TB array on the controller, or creating 5 1TB arrays and using LVM to regroup them into one (or more) VGs?
I'm particularly interested in hearing why you would pick one approach or the other.
Thanks!
If the controller will allow you to provision a 10-disk RAID 10 (rather than an 8-disk unit with 2 disks left over), that would probably be the best bet. It's simple to manage, you get good write performance with battery-backed cache, and the RAID card does all the heavy lifting: monitoring and management. Just install the RAID card's agent in the OS so you can reconfigure and monitor status from within the OS, and you should be set.
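For an LSI card that usually means MegaCli (or the newer storcli); a few status checks might look like the following, though the exact utility and install path depend on the card and the agent package, so treat this as a sketch:

    # Assumes the MegaCli64 binary from LSI's package; install path varies
    MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

    # Logical drive state (Optimal, Degraded, ...)
    $MEGACLI -LDInfo -Lall -aALL

    # Physical drive status: firmware state, media errors, predictive failures
    $MEGACLI -PDList -aALL

    # Battery-backed write cache health
    $MEGACLI -AdpBbuCmd -GetBbuStatus -aALL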
Putting everything in the care of the RAID card makes the quality of the software on the card the most important factor. I have had RAID cards crash, causing the whole IO subsystem to "go away" and requiring a server reboot, and I've even had a card completely lose its array configuration, requiring either a careful reconfiguration from the console or a full restore from backups. The chances that you, with your one server, would see any particular problem are low, but if you had hundreds or thousands of servers you would probably see these kinds of problems periodically. Maybe newer hardware is better; I haven't had these kinds of problems in a while.
On the other hand, it is possible and even probable that the IO scheduling in Linux is better than what's on the RAID card, so either presenting each disk individually or presenting them as 5 RAID 1 units and using LVM to stripe across them might give the best read performance. Battery-backed write cache is critical for good write performance, though, so I wouldn't suggest any configuration that doesn't have that feature. Even if you can present the disks as a JBOD and keep battery-backed write cache enabled at the same time, there is additional management overhead and complexity to using Linux software RAID and smartd hardware monitoring. It's easy enough to get set up, but you need to work through the procedure for handling drive failures, including the boot drive. It's not as simple as popping out the disk with the yellow blinky light and replacing it. Extra complexity can create room for error.
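To give a sense of that extra procedure, replacing a failed member of a Linux software RAID set goes roughly like this (a sketch only; md0, sdb, and sdc are hypothetical names, and the steps get more involved if the failed disk is the boot drive):

    # Mark the dying disk failed and remove it from the array
    mdadm --manage /dev/md0 --fail /dev/sdc1
    mdadm --manage /dev/md0 --remove /dev/sdc1

    # After physically swapping the drive, copy the partition table
    # from a surviving member (use sgdisk instead of sfdisk for GPT disks)
    sfdisk -d /dev/sdb | sfdisk /dev/sdc

    # Re-add the new partition and watch the rebuild
    mdadm --manage /dev/md0 --add /dev/sdc1
    cat /proc/mdstat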
So I recommend a 10-disk RAID 10 if your controller can do it, or 5 RAID 1s with LVM striping if it can't. If you test out your hardware and find that JBOD plus Linux RAID works better, then use that, but you should specifically test for good random write performance across a large portion of the disk using something like sysbench rather than just sequential reads using dd.
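For the testing part, a sysbench random-write run might look something like this (flag spellings differ between sysbench versions, and the file size, thread count, and test directory are placeholders to adjust for your hardware):

    # Run from a directory on the filesystem under test; make the file set
    # much larger than the controller and page caches so the disks do real work
    cd /mnt/test
    sysbench fileio --file-total-size=100G prepare

    # Random writes with periodic fsync across 16 threads for 5 minutes
    sysbench fileio --file-total-size=100G --file-test-mode=rndwr \
        --file-fsync-freq=16 --threads=16 --time=300 run

    sysbench fileio --file-total-size=100G cleanup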
That's actually R10, not R1, and R10 is what I'd use: let the OS see all ten raw disks and manage it 100% in software. Anything else is needlessly complex.
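A minimal sketch of that all-software layout, assuming the controller can pass the disks through and they show up as /dev/sdb through /dev/sdk (hypothetical names, as is the mount point and filesystem choice):

    # One 10-disk RAID 10 across the raw disks
    mdadm --create /dev/md0 --level=10 --raid-devices=10 /dev/sd[b-k]

    # Record the array so it assembles at boot, then put a filesystem on it
    mdadm --detail --scan >> /etc/mdadm.conf
    mkfs.ext4 /dev/md0
    mount /dev/md0 /export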
If you're stuck with 2TB LUNs due to 32-bittedness somewhere, I'd strongly lean towards making 5x 1TB RAID1 LUNs on the RAID card and throwing them into a volume-group to make one big 5TB hunk o' space. That way the card handles the write multiplication implicit in the RAID1 relationship, and you get 5TB of space.
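Roughly like this, assuming the five RAID1 LUNs show up as /dev/sdb through /dev/sdf (the device, VG, and LV names are all placeholders):

    # Label each 1TB RAID1 LUN as an LVM physical volume and group them
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
    vgcreate vg_nfs /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

    # One big logical volume; -i 5 -I 64 stripes it across all five LUNs
    # (drop those flags if you just want a simple concatenation)
    lvcreate -n lv_export -l 100%FREE -i 5 -I 64 vg_nfs
    mkfs.ext4 /dev/vg_nfs/lv_export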
If you can make LUNs larger than 2TB, I lean towards making that one big array on the RAID card. The strength of my lean depends A LOT on the capabilities of the RAID card in question. I don't know what it is, so I can't advise you. If I didn't trust it, I'd stick with the 5x 1TB RAID1 arrangement.
I'd suggest using the expensive RAID controller to do the bulk of the RAID work. LSI cards and the software they come with work quite nicely. When properly configured, they will send you email when interesting things happen to the array, like when disks fail. There is nothing wrong with either of the two Linux software RAID options, but you've gone out and purchased a somewhat fancy RAID card. Let it do the work.
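If you'd rather not run the full vendor management suite, even a crude cron job around MegaCli can cover the email part (a sketch, assuming MegaCli sits at the usual path, the box can send mail, and the address is a placeholder):

    #!/bin/sh
    # Mail a warning if any logical drive has left the Optimal state.
    # Drop in /etc/cron.hourly/ or run from crontab.
    MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64
    STATUS=$($MEGACLI -LDInfo -Lall -aALL | grep '^State')
    if echo "$STATUS" | grep -qv Optimal; then
        echo "$STATUS" | mail -s "RAID degraded on $(hostname)" admin@example.com
    fi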
Configure the disk array to expose one big device to Linux. If you would like to break up the final device into smaller volumes, use LVM for that: one big physical volume, one big volume group, and cut the volume group into whatever number of logical volumes you need.
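Something like this, assuming the controller presents the whole array as /dev/sdb (the device, VG, and LV names and sizes are all placeholders):

    # One big physical volume, one big volume group
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb

    # Cut it into whatever logical volumes you need, e.g. two NFS exports
    lvcreate -n lv_projects -L 2T vg_data
    lvcreate -n lv_archive -l 100%FREE vg_data
    mkfs.ext4 /dev/vg_data/lv_projects
    mkfs.ext4 /dev/vg_data/lv_archive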