I am about to replace an old hardware RAID5 array with a Linux software RAID1 array. I was talking to a friend and he claimed that RAID5 was more robust than RAID1.
His claim was that with RAID5, the parity data is read on every read to make sure all the drives are returning correct data. He further claimed that with RAID1, errors occurring on a drive will go unnoticed, because no such checking is done.
I can see how this could be true, but I can also see that it all depends on how the RAID systems in question are implemented. Surely a RAID5 system doesn't have to read and check the parity data on every read, and a RAID1 system could just as easily read from all drives on a read and check that they hold the same data, achieving the same level of robustness (with a corresponding loss of performance).
So the question is: what do RAID5/RAID1 systems in the real world actually do? Do RAID5 systems check the parity data on reads? Are there RAID1 systems that read from all drives and compare the data on read?
RAID-5 is a fault-tolerance solution, not a data-integrity solution.
Remember that RAID stands for Redundant Array of Inexpensive Disks. Disks are the atomic unit of redundancy -- RAID doesn't really care about data. You buy solutions that employ filesystems like WAFL or ZFS to address data redundancy and integrity.
The RAID controller (hardware or software) does not verify the parity of blocks at read time. This is a major risk of running RAID-5 -- if you encounter a partial media failure on a drive (a situation where a bad block isn't marked "bad"), you are now in a situation where your data have been silently corrupted.
Sun's RAID-Z/ZFS actually provides end-to-end data integrity, and I suspect other filesystems and RAID systems will provide this feature in the future as the number of cores available on CPUs continues to increase.
If you're using RAID-5, you're being cheap, in my opinion. RAID 1 performs better, offers greater protection, and doesn't impact production when a drive fails -- for a marginal cost difference.
I believe that the answer depends on the controller/software. For example, it is quite common for mirroring systems to read only one disc out of a pair, and therefore to be capable of delivering the wrong data. Note that if your results depend on that data, then when the data is written back to both discs it is corrupted on both discs.
From the pdf under SATAssure(tm) Plus:
"Revolutionary SATAssure technology delivers enterprise-class data protection and reliability using large capacity, inexpensive SATA disk drives. SATAssure operates on all read operations, ensuring data integrity and automatically corrects problems in real-time – all without the performance or capacity penalty found in traditional storage systems. Reduce drive RMAs with a new ability to power-cycle individual drives. "
It is interesting that some manufacturers make a fuss about the fact that they always compute parity; this leads me to think that it is relatively uncommon on hardware controllers. It is also of note that systems such as ZFS and WAFL (NetApp) do parity calculations for every read.
With RAID-5, parity is generally only read on an array rebuild, not on a general read. This keeps reads, particularly random reads, fast (since you don't have to read and calculate parity for an entire stripe every time you want 1K of data from the array).
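To make the parity arithmetic concrete, here is a toy sketch in Python (this is just the arithmetic, not how any particular controller implements it): the parity chunk is the XOR of the data chunks in a stripe, so a normal read never needs it, while a rebuild recovers a lost chunk by XORing parity with the survivors.

    from functools import reduce

    def xor_blocks(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    chunks = [b"AAAA", b"BBBB", b"CCCC"]   # data chunks in one stripe
    parity = reduce(xor_blocks, chunks)    # parity chunk = XOR of the data chunks

    # A normal read just returns the chunk it wants; parity is never touched.
    # A rebuild XORs parity with the surviving chunks to recover the lost one:
    rebuilt = reduce(xor_blocks, [chunks[0], chunks[2], parity])
    assert rebuilt == chunks[1]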
With RAID-1, reads are generally stepped across drives whenever possible to give increased read performance. As you noted, if the RAID subsystem tries to read both drives and they differ, the subsystem has no way of knowing which drive was wrong.
Most RAID subsystems depend on the drive to inform the controller or computer when it is going bad.
So is RAID-5 "more robust"? The answer is, it depends. RAID-5 lets you get more effective storage for a given number of disks than RAID-1 does; although to give effective storage beyond one disk, RAID-1 needs to be combined with RAID-0, either as a stripe of RAID-1 arrays, or a RAID-1 across two RAID-0 stripes.
(I prefer the former, since a single drive failure will take out a single RAID-1 element, meaning that only a single drive will require rebuilding. With the latter, a single drive failure kills a RAID-0 element, meaning that HALF the disks will be involved in the rebuild when the drive gets replaced.)
This also leads to discussions of "phantom writes", where a write is reported as successful by the drive electronics, but for whatever reason the write never makes it to the disk. This does happen. Consider that for a RAID-5 array, when you have a drive failure the array MUST read ALL sectors on ALL surviving drives PERFECTLY in order to recover. NetApp claims that the large size of drives plus the large size of raid groups means that in some cases your chances of failing during a rebuild can be as bad as one in ten. Thus, they are recommending that large disks in large RAID groups use dual-parity (which I think is related to RAID-6).
I learned this at a NetApp technical discussion given by a couple of their engineers.
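As a rough back-of-the-envelope version of that rebuild math (the error rate and drive sizes below are assumptions I've picked for illustration, not NetApp's figures):

    # Chance of hitting at least one unrecoverable read error (URE) while
    # reading every sector of the surviving drives during a RAID-5 rebuild.
    ure_per_bit = 1e-14            # a commonly quoted spec figure for SATA drives
    surviving_bytes = 6 * 10**12   # e.g. six surviving 1 TB drives
    bits_to_read = surviving_bytes * 8
    p_failure = 1 - (1 - ure_per_bit) ** bits_to_read
    print(f"~{p_failure:.0%} chance the rebuild hits a URE")   # roughly 38% here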
No common RAID implementation checks the parity on data access; I've never seen one that does. Some RAID5 implementations read the parity data for streaming reads to prevent unnecessary seeking (it's cheaper to throw away every nth block than to make the drive seek over every nth block). RAID1 implementations can't check because they read from both disks for performance (well, in the vast majority of RAID1 implementations; a handful let you pick, which can be useful if one disk is much slower than the other and it's not a write-intensive load).
Some do check, with a background 'scrubbing'. In that case RAID6 wins, as it can recover the data; RAID5 and RAID1 are in the same situation: you can identify the problem but not fix it. (This is not strictly true, as the drive could detect a bad CRC, return an error, and let you rewrite the block from parity. This happens quite commonly.)
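On Linux md specifically, that scrub can be kicked off by hand through sysfs. A minimal sketch, assuming an array at /dev/md0 and root privileges:

    # Ask the md driver to read and verify every stripe of /dev/md0,
    # then report how many mismatches the last check found.
    md = "/sys/block/md0/md"

    with open(md + "/sync_action", "w") as f:
        f.write("check")          # "repair" would also rewrite inconsistent stripes

    # ...wait until sync_action reads "idle" again, then:
    with open(md + "/mismatch_cnt") as f:
        print("mismatched sectors:", f.read().strip())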
If you want data integrity, store a hash with every block (or record, or however the data is divided up) at the application layer. Sybase and Oracle do this (I believe at the page level), and I've seen it save a gigantic database on many occasions (e.g. a controller starts returning bad data, Sybase crashes with a clear error, and therefore no writes were done while the database was running on failing hardware in an inconsistent state).
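The same idea is easy to sketch at the application layer: keep a digest for every block and verify it on read. A minimal illustration (the block size and the in-memory table of digests are made up for the example; a real system would persist the digests somewhere trustworthy):

    import hashlib

    BLOCK = 8192   # made-up block size for the example

    def write_block(f, sums, index, data):
        padded = data.ljust(BLOCK, b"\0")
        f.seek(index * BLOCK)
        f.write(padded)
        sums[index] = hashlib.sha256(padded).hexdigest()

    def read_block(f, sums, index):
        f.seek(index * BLOCK)
        data = f.read(BLOCK)
        if hashlib.sha256(data).hexdigest() != sums[index]:
            raise IOError(f"block {index}: storage returned data that fails its checksum")
        return data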
The only filesystem solution and the only RAID solution that does this for you is ZFS.
Is your friend talking about the parity bit that is involved in some RAID levels, or the checksum of the data written to disk?
If they're on about parity, then RAID1 does not have a parity bit -- you have two copies of the same data. There should be a checksum performed by the disk to ensure that what was written to disk matches what came down the wire.
RAID5 does have a parity bit. This means that you can lose a disk in your RAID set and continue as if nothing happened. Still, there should be a checksum performed on the data written to disk to ensure it matches what came down the wire.
In this instance, checksums are totally independent of whatever RAID may or may not be performed with a bunch of disks.
Edited to add: You mentioned moving from hardware RAID to software RAID. The preference is always hardware RAID over software RAID. If you can purchase the hardware required to give the RAID level you want to implement, I'd suggest you go for that. This will enable all the parity calculations to be performed by the RAID card rather than the host, thereby freeing up resources on the host. There are no doubt other benefits, but they escape me at the moment.
That would depend on the RAID implementation type (hw/sw), the disks, the RAID controller (if any), and its features.
It does make some slight sense, but not really :) What happens is: if wrong data is written, on a mirror it will be sent to both drives, and on RAID5 parity will be generated for it and spread across the drives. Data read/write checking is done by the disk and controller firmware, and has nothing to do with RAID levels.
As I said, the checks aren't part of the RAID algorithm, although some controllers might have something additional implemented.
The robustness of the array comes down to the quality of the drives (2.5" drives tend to live longer than 3.5" ones due to decreased RV rates; in my experience, NEVER buy Maxtor SCSI/SAS drives - they have horrible firmware glitches), the environment (temperature and humidity control), the controller itself (does it have a BBU? is the firmware up to date? is it real RAID or fakeraid?), the number of PSUs in the server, the UPS quality, etc.
I don't know this, but it seems unlikely to me that it does. Remember that in order to check the parity, it would have to read the block from all drives in your RAID set and then do the math to determine correctness, whereas if it doesn't, it just does the read off one drive.
Also, if your read is for less than one block, a parity-check read would have to expand it to a full block, whereas a regular read wouldn't. (Assuming, of course, that the RAID block is bigger than the disks' blocks. I think that reads from disk have to be of full blocks. If not, my point is even more valid.)
So, from my point of view, yes, it could do that, but if it did, it would be inefficient, and I doubt that any are implemented that way.
Again, though, I have no personal knowledge of actual implementations.
It doesn't really make sense to. What do you do when you find a parity mismatch? (How do you know which block is wrong?)
For random reads checking parity would be expensive. Normally you could service a random read by just looking at a single disk, but if you want to check parity you'd need to read all disks on each read. (That might still make sense if there were anything you could do about it!)
Note that RAID-1 has this problem too -- which makes sense when you look at a RAID-1 as a two disk RAID-5.
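A toy example of why a single parity chunk can only detect, not locate, the damage (same XOR arithmetic as in the earlier sketch, purely illustrative):

    from functools import reduce

    def xor_blocks(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = reduce(xor_blocks, data)

    data[0] = b"AAAX"                          # silent corruption on one member
    check = reduce(xor_blocks, data + [parity])

    print(any(check))   # True: the stripe is inconsistent...
    # ...but nothing here tells you whether data[0], data[1], data[2]
    # or the parity chunk itself is the one that went bad.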
I've been thinking a bit about the claim that RAID-1 should be faster on reads than RAID-5, since it reads across both drives at once.
Now, since parity is not read on RAID-5 unless the array needs a rebuild, it actually equals a RAID-0 array in terms of reading, am I correct?
RAID-0 is generally regarded as being the fastest level (although it should be named "AID", since there's no redundancy whatsoever). :-D
Speaking of Linux software RAID, a simple test - using hdparm - confirms this theory: my RAID-5 arrays always show a higher read speed than my RAID-1 arrays.
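For reference, the timing test is just hdparm's sequential read timing, wrapped here in a tiny Python loop (the device names are placeholders for your own md arrays):

    import subprocess

    # hdparm -t does a simple sequential read timing of the device.
    for dev in ("/dev/md0", "/dev/md1"):
        subprocess.run(["hdparm", "-t", dev], check=True)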
BUT: A degraded array performs much slower than a normally running array, it seems! I've just tested this with Fedora 9, running on 4 WD 1 TB drives with different RAID levels. Here are the results:
Degraded RAID-5: read speed 43 MB/sec
Normal RAID-5: read speed 240 MB/sec (!)
RAID-1: read speed 88 MB/sec
Since the allowed loss of disks is the same in RAID-1 and RAID-5 (namely one), I think RAID-5 should outperform RAID-1 in every respect - more capacity relative to the number of disks used in the array, with the same fault tolerance. That leads to the conclusion that RAID-6 outperforms every other RAID level, since it's as fast as RAID-0 on normal reads (no parity is read from the two parity disks) and it's still fault tolerant if an array member is lost. ;-)
Personally, I think that the ultimate test of a RAID system is how well it can withstand failure. In this case, both RAID5 and RAID1 can handle a single drive failure, but neither will survive any more than that.
As for your question on the parity bit, I would think it depends on the RAID drivers. The parity will definitely be read during reconstruction, but in normal use it would not make much sense to do so, as bandwidth would be wasted on it.