I have a hand-me-down server that I'm setting up at home, and it's got six 72 GB hard disks (as well as two 18 GB drives that I'm using for the OS). What is the best way to configure those six drives?
Should I RAID 5 or 6, or go with something simpler, like mirroring?
I'm planning to use it to hold a source control repository, and possibly data for a development SQL server.
The machine has a hardware raid controller. It is an old IBM server.
The Short
Best: RAID 10
Better: RAID 6
If you must: RAID 5 w/hot spare
Whatever you do, consider buying new disks. Older disks are more likely to be near their failure point and rebuilding RAID arrays is never fun.
The Long
It depends a lot on what your controller supports. Some older controllers support only RAID 0, 1, and 5. From your account of this as an older server, I'd suspect that at most it supports RAID 0, 1, 5, and 10. RAID 6 is still relatively new to the field.
RAID 0 is out because it increases your vulnerability to failure.
RAID 1 in three separate virtual disks would perform all right and give you solid fault tolerance, but dealing with the extra volumes will be annoying.
RAID 5 has been the old standby for a long time, and with disks your size it would probably be all right. But it has some major problems, chief among which is the likelihood of unrecoverable read errors that can hose your rebuild. Say a disk fails. No problem, your data is safe in the parity stripe across the others, right? Unless there's a read error in that stripe on one of the remaining disks - then that sector is irrecoverable. The effects can range from very minor (a broken temporary file) to a failed array rebuild (reference). The chances of a read error go up with the number and size of the disks in your array. Another issue with RAID 5 is that drives often fail in groups - and you can only take one disk failure before you're hosed. With older disks like you're working with, you're asking for trouble. Finally, and this may not apply, but a lot of older cards have very poor RAID 5 performance. Definitely test this before deploying.
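That read-error risk can be put in rough numbers. A minimal sketch, assuming the commonly quoted consumer-disk rate of one unrecoverable read error (URE) per 10^14 bits read - actual rates vary by drive and vintage:

```python
import math

def rebuild_ure_probability(disk_gb, surviving_disks, ure_rate_bits=1e14):
    """Chance of at least one URE while reading every surviving disk in full
    to rebuild a degraded single-parity array. Textbook model, not a vendor spec."""
    bits_read = surviving_disks * disk_gb * 8e9
    # P(at least one error) = 1 - (1 - 1/rate)^bits, computed accurately
    return -math.expm1(bits_read * math.log1p(-1.0 / ure_rate_bits))

# Six 72 GB disks: one fails, the other five are read end to end.
p_small = rebuild_ure_probability(72, 5)     # roughly 3%
# The same array built from 2 TB disks - the odds get alarming.
p_large = rebuild_ure_probability(2000, 5)   # over 50%
print(f"72 GB disks: {p_small:.1%}, 2 TB disks: {p_large:.1%}")
```

Small, old disks keep the rebuild-failure odds tolerable; the same math is why RAID 5 stops being defensible as disks grow.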
RAID 6 fixes a lot of RAID 5 problems by using two parity stripes, so you lose the capacity of two disks. The advantage is that if you have a read error in one of your stripes, no problem, chances are that parity bit will be readable in the other. It can also sustain two disk failures so the likelihood of losing your whole array is much lower. It tends to be a bit slower than RAID 5 on the same controller, but the performance loss is more than worth it. This seems to be the current RAID sweet spot.
RAID 10 is the most expensive RAID implementation in common use (in terms of disk space), but this is a case of getting what you pay for: mirrored pairs striped together into one array. It can sustain up to 50% disk failure as long as no mirrored pair loses both its disks. There is no parity calculation overhead, so performance is excellent. By far my preferred RAID level right now.
There are more exotic RAID levels not widely supported: RAID 50 (striped RAID 5), RAID 60 (striped RAID 6), RAID 3 (byte-level parity), RAID 4 (block-level parity) and others. 3 and 4 have some subtle performance differences from 5. 50 and 60 are probably the way of the future.
In the end, I strongly recommend RAID 10. If space is a problem, buy bigger disks. Relative to the value of your data, doubling your disk size should be very inexpensive (this is not always true, so take advantage of it while you can).
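To make the space trade-off concrete for six 72 GB disks, here's a quick sketch using the textbook formulas (real controllers shave off a little for metadata):

```python
# Usable capacity of an array of identical disks under common RAID levels.
def usable_gb(level, disks, disk_gb):
    if level == "RAID 0":
        return disks * disk_gb         # striping, no redundancy
    if level == "RAID 1":
        return disks // 2 * disk_gb    # mirrored pairs
    if level == "RAID 5":
        return (disks - 1) * disk_gb   # one disk's worth of parity
    if level == "RAID 6":
        return (disks - 2) * disk_gb   # two disks' worth of parity
    if level == "RAID 10":
        return disks // 2 * disk_gb    # stripe of mirrors
    raise ValueError(f"unknown level: {level}")

for level in ("RAID 0", "RAID 5", "RAID 6", "RAID 10"):
    print(f"{level}: {usable_gb(level, 6, 72)} GB usable")
# RAID 0: 432 GB, RAID 5: 360 GB, RAID 6: 288 GB, RAID 10: 216 GB
```

So choosing RAID 10 over RAID 6 here costs 72 GB of usable space in exchange for the performance and rebuild advantages.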
It's a personal choice, but I like to use RAID 5 for most of my stuff. It's pretty tried and proven, and I've always had good experiences when rebuilding or recovering from it.
It's going to depend a lot on how much space you need, how much speed you need, etc.
Check out this link for more in depth explanation of various RAID levels, etc:
http://decipherinfosys.wordpress.com/2007/01/30/what-raid-is-best-for-you
This link here might actually be a little better:
Selecting the Best RAID Level
EDIT:
There seem to be a lot of people very passionate about RAID 6, and I agree with all of their comments, so I figured I'd update my answer to include the following:
In a production environment with that many drives I would choose RAID 6 over RAID 5.
In a home environment with (assuming) non-critical data I suggested RAID 5. Here was my line of thinking:
RAID 6 requires a second set of parity calculations to be made so that data from two failures can be rebuilt.
This additional parity calculation adversely affects write performance. The question is, how much?
Some benchmarks have shown that a RAID controller can suffer more than a 30% drop in overall write performance in RAID 6 compared to a RAID 5 implementation while read performance remains unaffected.
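The exact penalty depends on the controller, but the underlying arithmetic is simple. On the read-modify-write path, each random small write must read and rewrite the data block plus every parity block. A rough model (it ignores full-stripe writes and controller caching, which is why real-world numbers vary):

```python
# I/O operations per random small write on the read-modify-write path.
def small_write_ios(parity_disks):
    # read old data + old parity block(s), then write new data + new parity
    return (1 + parity_disks) * 2

raid5 = small_write_ios(1)  # 4 I/Os per small write
raid6 = small_write_ios(2)  # 6 I/Os per small write
print(f"RAID 6 costs {raid6 / raid5:.0%} of RAID 5's I/O per small write")
```

That 6-vs-4 I/O ratio is where write-heavy benchmarks find their RAID 6 penalty; reads don't touch parity, which is why read performance is unaffected.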
Don't get me wrong, I actually like RAID 6. I'm just not sure it's necessary in the scenario described above.
On the flip side... you have a hand-me-down server and (presumably) want to experiment and play with it as if it was a production server. In that case... let's treat it like one and go with RAID 6.
One other thing I failed to mention in my original answer which I should have addressed is your other two operating system drives. I would recommend you MIRROR those two and put the six drives in either RAID 5 or RAID 6.
The choice is totally up to you as to whether you want the extra parity drive or not.
I hope this satisfies everybody. :-)
Mirror the two 18GB drives for the OS. Then use RAID-5 or RAID-6 for the other six drives. Your choice is really about redundancy. I doubt you're going to max the performance of the array.
With RAID-5 you can lose one drive before you're in danger; with RAID-6 you can lose two.
If you did do a RAID-10, you could lose up to three drives (one from each mirrored pair), but I don't think that's worth it for what you're doing. I'd just go with RAID-5, personally.
I know this isn't the answer you're looking for, but I have been down this road before. Buy one of the little devices that tells you how many watts a machine is drawing.
If the machine is to be on 24x7, you can do the math and discover that the cost of running it is huge. Over a year it would be cheaper to buy a Mac mini or a small Atom-based system, which will do the same job as a server three or four years old just as fast. You can buy an external disk enclosure for modern disks, with a mirror and the same amount of storage if not a lot more, for around £200.
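To see how the math works out, here's the arithmetic with illustrative round numbers - the 300 W draw and 12p/kWh tariff are made up for the example; plug in your own meter reading and rate:

```python
# Illustrative 24x7 running-cost arithmetic; wattage and tariff are assumptions.
def annual_cost_gbp(watts, pence_per_kwh):
    kwh_per_year = watts / 1000 * 24 * 365
    return kwh_per_year * pence_per_kwh / 100

old_server = annual_cost_gbp(300, 12)  # hand-me-down server drawing ~300 W
mac_mini = annual_cost_gbp(30, 12)     # the ~30 W alternative mentioned above
print(f"Old server: ~£{old_server:.0f}/yr, Mac mini: ~£{mac_mini:.0f}/yr")
```

At those assumed figures the old server costs roughly ten times as much to run, so the £700 replacement can pay for itself within a few years.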
For what would have been: Mac mini £500, disks £200, total £700, total power around 30 watts.
However, I managed to survive on a 300 MHz MIPS box that cost £100 and draws even less power, and it runs Debian with MySQL, a web server, a web proxy, and SSH.
What RAID level you pick should really depend on how much storage you need and how much performance you need.
Personally, if you don't need more than 200 GB for the repository and SQL data, I'd use RAID 10; it'll be more redundant and perform better. If you need more space, then go with RAID 5 (RAID 6 will only get you 72 GB more than RAID 10).
It's a question of performance vs usable disk space vs fault tolerance:
RAID10 = Fastest, can lose one disk in each pair, least volume
RAID5 = Slower, can lose one disk, more volume
RAID6 = Slower, can lose 2 disks, medium volume
One of the advantages of Raid 6 over Raid 5 is that if one drive has failed, and a second drive fails while rebuilding a replacement for the failed drive, Raid 6 will survive this. It would be catastrophic if that happens with Raid 5.
In your case, the drives are small, and the rebuild time for 1 drive will be low (compared to much larger drives).
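Rough rebuild-time arithmetic backs that up; the 50 MB/s sustained rate below is an assumption for drives of that era, not a measured figure:

```python
# Rough rebuild-time estimate: time to write one replacement disk end to end,
# assuming the array can feed it at a sustained rate with no other load.
def rebuild_minutes(disk_gb, mb_per_s=50):
    return disk_gb * 1000 / mb_per_s / 60

print(f"72 GB drive: ~{rebuild_minutes(72):.0f} min")
print(f"2 TB drive: ~{rebuild_minutes(2000) / 60:.1f} h")
```

A window of under half an hour with degraded redundancy is a very different risk than the half-day-plus exposure you get with large modern drives.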
Two other things to consider in your situation, in the case of a drive failure...
How long will it take to have a replacement drive paid for and delivered?
Will you need to continue to use the system during that time, or can you power everything down while you wait?
In a production environment, you would probably have a few spare replacements on hand, and you probably would want the system to remain under power and online with a possible short scheduled downtime.
So, in your case, in case of a drive failure, if you can manage to power down the system until you are ready to rebuild the replacement drive, then Raid 6 has little advantage over Raid 5. Further, considering Raid 6 will lose the space of an additional drive to parity, and has slower write times, Raid 5 will probably be your best choice.
On the other hand, if you'll want to keep the system powered up, or if you just feel better about the additional fault tolerance of Raid 6, and you won't miss the loss of some drive space, then I'd go with Raid 6.
The nice thing about a hand-me-down server is that you can experiment with different configurations and see what performs best and what you think will work best. I have an old server with 12 drives in various types of RAID for that very purpose.
I don't use RAID5 (or RAID6) in production anymore. My app is mostly writes, of course, and data consistency and resiliency are primary; but I really don't think I'd use RAID5 for almost anything these days. I prefer RAID 10.
Now, I've heard that EMC and some other vendors have a proprietary RAID that is like RAID 5 but doesn't have some of RAID 5's problems. I would read up on that if you're really keen, but I'd rather just use RAID 10.
RAID10 has better write performance, better survivability, better read performance. The only thing it doesn't have is so called "efficiency". But since my apps are mostly IO bound, I can't afford to be cheap here.
http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt
If you're using Windows of some sort, make sure to align your partitions or dynamic volumes. Other than that, I agree with previous answers: RAID10 for write perf, RAID6 for efficiency.