When creating a Linux software RAID device as a RAID10 device, I am confused why it must be initialized. The same question applies to RAID1 or RAID0, really.
Ultimately most people will put a filesystem of some sort on top of it, and that filesystem should not assume anything about the state of the disk's data. Every write to a RAID10 or RAID1 device goes to all N mirrors anyway. There should be no reason whatsoever for a RAID10 to be initialized up front, as initialization will effectively happen over time.
I can understand the need for a RAID5/6 setup, where there is a parity requirement, but even then it seems like this could be done lazily.
Is it just so people feel better about it?
RAID 1, being a mirror, depends on all disks in the mirror being exact copies of each other. Take one random hard drive and another random hard drive, and they will most likely contain different data, violating this assumption. This is why initialization is needed: it simply copies the contents of the first drive to the others. Note that in some conditions you can get away with not initializing the drives; factory-new devices usually come zeroed, so you can simply skip it. The `mdadm` option `--assume-clean` does this, but warns you: if you skip initialization and a block that differs between the drives is read, there is no knowing which data you will get back. You should be pretty safe with a filesystem (but note the caveats below), because most probably you will write before you read anything from that device, and then you're in the clear.
Note that at least Linux's `mdadm` will initialize the array in the background. You can happily create a filesystem on top of it from the first second. Performance will suffer until the initialization finishes, but that's all. But:

a) When doing `mkfs`, some utilities check whether there is already something on that drive. While this only touches a few well-known regions of the drive, it reads before you write anything, putting you in danger.

b) If you do a periodic resync of your array, the RAID device knows nothing of your filesystem. It simply reads every block from every device and compares them. And if you are not using a copy-on-write filesystem (e.g. ZFS or Btrfs) and never fill your filesystem, it is perfectly plausible for a block to stay uninitialized from the filesystem's perspective for years.
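A minimal sketch of using the array immediately while the background sync runs (device and mount-point names are assumptions; needs root):

```shell
# Create the array without --assume-clean; resync starts in the background
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

mkfs.ext4 /dev/md0    # safe to do right away
mount /dev/md0 /mnt   # and to start using it

cat /proc/mdstat      # shows the resync progress bar and an ETA
```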
Why resyncing with RAID1 devices?
For the same reason you resync RAID5 devices or any other level (except RAID0). The resync reads all data and compares the copies (or verifies parity, in RAID 5 or 6). If a bit was flipped in any way (because a sector on the platter degraded spontaneously, because the cellphones of you and your five neighbours happened to interfere over this particular region of the platter, whatever), it will detect the inconsistency, but won't be able to tell you which copy is correct. If, on the other hand, one of the hard drives simply reports "I cannot read that block", which is more probable with a failing drive, you have just detected a failure early and reduced the time you run in degraded mode (counting from the drive failure, not from when you notice it). RAID won't help you if one drive fails and the other one fails a month later, when you didn't notice the first failure during that month.
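On Linux md, such a consistency check can be triggered manually through sysfs (a sketch; `md0` is a placeholder and this requires root):

```shell
# Start a read-and-compare pass over the whole array
echo check > /sys/block/md0/md/sync_action

# After the check finishes, a non-zero count indicates
# sectors where the mirrors (or parity) disagreed
cat /sys/block/md0/md/mismatch_cnt
```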
RAID10
Now, for RAID10 all of the above holds. After all, RAID10 is just a clever way of saying "I'm putting my two RAID1 devices into a RAID0 stripe".
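Accordingly, creating a RAID10 array with `mdadm` looks just like the RAID1 case, only with more devices (a sketch with placeholder device names; requires root):

```shell
# Four members, striped pairs of mirrors; initialization
# again runs in the background after creation
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
```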
Caveat:
This is all undefined behaviour. While I've checked this on Linux, using `mdadm`, other software RAID implementations may behave differently. Other versions of the Linux kernel and/or the `mdadm` tools than the ones I'm using may also behave differently.

Remember that RAID 1 is a mirror, and that RAID 10 is a stripe of mirrors.
The question is, on which disk in each mirror is the data valid? In a freshly created array, this cannot be known, as the disks may have different data.
Remember also that RAID operates at a very low level; it knows nothing of filesystems or whatever data might be stored on the disk. There might not even be a filesystem in use.
Thus, initialization in these arrays consists of the data from one disk in each mirror being copied as-is to the other disk.
This also means that the array is safe to use from the moment of creation, and can be initialized in the background; most RAID controllers (and Linux mdraid) have an option for this, or do it automatically.
Initial synchronization is needed because any differences between the mirrors would show up as errors during the periodic check.
And you should be doing periodic checks.
Simply put: because two new disks are not expected to be perfect mirror copies of each other from the outset.
They need to be turned into perfect copies of each other.
In addition, initialization sets up the metadata superblock, which records the array configuration.
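That superblock can be inspected on any member device with `mdadm --examine` (a sketch; the device name is a placeholder and reading it requires root):

```shell
# Print the md superblock of one member: array UUID, RAID level,
# number of devices, and this device's role in the array
mdadm --examine /dev/sdb1
```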
The /proc/mdstat file should tell you that the device has been started, that the mirror is being reconstructed, and an ETA of the completion of the reconstruction. Reconstruction is done using idle I/O bandwidth. So, your system should still be responsive, although your disk LEDs will also be showing lots of activity.
The reconstruction process is transparent, so you can actually use the device even though the mirror is currently under reconstruction.
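What that looks like in `/proc/mdstat` is roughly the following (an illustrative sketch, not literal output from any particular machine; device names and sizes are made up):

```shell
cat /proc/mdstat
# Personalities : [raid1]
# md0 : active raid1 sdc1[1] sdb1[0]
#       976630464 blocks super 1.2 [2/2] [UU]
#       [==>..................]  resync = 12.6% (123456789/976630464)
#       finish=71.2min speed=199544K/sec
```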
If you are using Linux LVM to create a RAID 1 (or 10) filesystem that you will immediately load with data, here's how you can avoid much of the unnecessary initialization I/O.
First create an ordinary linear (non-RAID) logical volume and load it with your data. Then convert it to a RAID volume with `lvconvert`. The mirror device will be initialized from your already-loaded filesystem data, so the only "unnecessary" I/O is copying the unallocated blocks of that filesystem. This is better than first copying every block from one uninitialized device to another and then writing your data to both devices. By serializing the two operations (loading the filesystem, then creating the mirror) you also allow the disks to perform sequential I/O, which is much faster than the random seeking that occurs when writing to a RAID mirror pair that is still initializing.
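A sketch of that workflow (the volume-group name `vg0`, LV name `data`, sizes, and paths are all assumptions; requires root):

```shell
# 1. Create a plain linear LV and load it with data
lvcreate -L 100G -n data vg0
mkfs.ext4 /dev/vg0/data
mount /dev/vg0/data /mnt
cp -a /source/. /mnt/          # sequential writes to a single device
umount /mnt

# 2. Convert it to a two-way mirror; only the existing extents
#    are copied, sequentially, onto the new mirror leg
lvconvert --type raid1 -m 1 vg0/data
```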