My understanding is that one scenario ZFS addresses is where a RAID5 drive fails, and then during the rebuild corrupt blocks are encountered, so that data cannot be restored. From Googling around I don't see this failure scenario demonstrated; there are articles on a disk failure, or articles on healing data corruption, but not both.
1) Is ZFS with a 3-drive raidz1 susceptible to this problem? I.e. if one drive is lost and replaced, and data corruption is encountered while reading/rebuilding, then there is no redundancy left to repair that data. My understanding is that the corrupted data will be lost, correct? (I do understand that periodic scrubbing will minimize the risk, but let's assume some tiny amount of corruption occurred on one disk since the last scrub, and a different disk then failed, so the corruption is only detected during the rebuild.)
2) Does a 4-drive raidz2 setup protect against this scenario?
3) Would a two-drive mirrored setup with copies=2 protect against this scenario? I.e. one drive fails, but the other drive contains 2 copies of all data, so if corruption is encountered during the rebuild, there is a redundant copy on that disk to restore from? It's appealing to me because it uses half as many disks as the raidz2 setup, even though I'd need larger disks.
I am not committed to ZFS, but it is what I've read the most about off and on for a couple years now.
It would be really nice if there were something similar to PAR archives/Reed-Solomon that generates some amount of parity protecting against up to 10% data corruption, using only an amount of space proportional to the x% corruption protection you want. Then I'd just use a mirror setup, and each disk in the mirror would contain a copy of that parity, which would be relatively small compared to option #3 above. Unfortunately I don't think Reed-Solomon fits this scenario very well. I've been reading an old NASA document on implementing Reed-Solomon (the only comprehensive explanation I could find that didn't require buying a journal article), and as far as my understanding goes, the set of parity data would need to be completely regenerated for each incremental change to the source data. I.e. there's no easy way to make incremental changes to the Reed-Solomon parity data in response to small incremental changes to the source data. I'm wondering, though, if there's something similar in concept (a proportionally small amount of parity data protecting against X% corruption ANYWHERE in the source data) that someone is aware of, but I think that's probably a pipe dream.
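(For what it's worth, plain XOR parity, the single-check-symbol case, can be patched incrementally from just the changed block and the old parity; whether a given Reed-Solomon implementation exposes a similar delta update is another matter. A toy shell sketch, using single bytes to stand in for blocks:)

```shell
# Toy XOR parity over three data "blocks" (here, single bytes).
d1=$(( 0x5a )); d2=$(( 0x33 )); d3=$(( 0xc4 ))
p=$(( d1 ^ d2 ^ d3 ))

# Change one block: the parity can be patched from the delta alone,
# without re-reading d2 and d3.
d1_new=$(( 0x7e ))
p_new=$(( p ^ d1 ^ d1_new ))

# Sanity check against a full recomputation.
full=$(( d1_new ^ d2 ^ d3 ))
[ "$p_new" -eq "$full" ] && echo "incremental parity update matches"
```

This is the same identity RAID5 uses for small writes: read the old block and old parity, XOR in the delta, and write both back.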
You've got things mostly right.
You can feel safe if only one drive fails out of a raidz1 pool. But if there is some corruption on one more drive, that data would be lost forever.
You can feel safe if up to two out of 4 drives fail in a raidz2 pool. If there is corruption on yet another drive... and so on.
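For concreteness, the two layouts above might be created like this; the pool name tank and the device names are placeholders:

```shell
# 3-drive raidz1: survives one whole-disk failure, nothing more.
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc

# 4-drive raidz2: survives two failures, so a rebuild after one lost
# disk can still repair corrupt blocks found on the survivors.
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Regular scrubs catch latent corruption while redundancy is intact.
zpool scrub tank
zpool status -v tank
```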
You can be mostly sure about that, but without a hard guarantee. With copies=2, ZFS tries to place the block copies at least 1/8 of the disk size apart. However, if you run into problems with the controller, it can quickly saturate your whole pool with junk. What you describe as parity is mostly what raidz does; in ZFS, data can be split evenly before replicating, yielding higher possible IO. Or do you really want your data and its parity on the same disk? If that disk silently fails, you will lose both the data and the parity.
A good catchphrase about data safety is: "Are you prepared for the fire?"
When a computer burns, every single disk inside burns with it. Fire is a real possibility, though less likely than a full disk failure. Full disk failures are fairly common, yet partial disk failures occur even more often.
If I wanted to secure my data, I'd first check which category the threat falls into. To survive any known disaster, I'd think of remote nodes and distant replication in the first place. Next would be drive failure, so I wouldn't bother with mirrors or copies; raidz2 or raidz3 is a nice thing to have, just build it from a bigger set of different disks. You know, disks from the same batch tend to fail under similar circumstances...
And a ZFS mirror covers almost anything about partial disk failure. When a disk starts showing errors, I throw in a third one, wait for the data to sync, and only then detach the failing drive.
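The attach-first, detach-later procedure reads roughly like this; device names are placeholders:

```shell
# Grow the two-way mirror to a three-way mirror before removing anything.
zpool attach tank /dev/sda /dev/sdc    # /dev/sda is a healthy existing member
zpool status tank                      # wait until the resilver completes
zpool detach tank /dev/sdb             # only now drop the failing disk
```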
In general, focus on ZFS mirrors over the parity RAID options. They are more flexible, with more predictable failure scenarios and better expansion options.
If you're paranoid, triple mirrors are an option. But again, RAID is not a backup... You have some great snapshot and replication options in ZFS. Make use of them to augment your data protection.
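A minimal sketch of the snapshot-plus-replication idea; pool, dataset, and host names are made up:

```shell
# Point-in-time snapshot, then full replication to another box.
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh backuphost zfs receive backup/data

# Later snapshots only need to send the incremental delta.
zfs snapshot tank/data@tuesday
zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data
```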