I'm using mdadm for several RAID1 mirrors. md7 is an N-way mirror consisting of 3 spinning disks (all flagged write-mostly) and an SSD:
md7 : active raid1 sdd1[0] sde5[3](W) sdf5[4](W) sdc1[1](W)
234428416 blocks [4/4] [UUUU]
md6 : active raid1 sdf6[0] sde6[1]
1220988096 blocks [2/2] [UU]
md2 : active raid1 sdb6[0] sda6[1]
282229824 blocks [2/2] [UU]
md1 : active raid1 sdb2[0] sda2[1]
19534976 blocks [2/2] [UU]
md0 : active raid1 sdb1[0] sda1[1]
192640 blocks [2/2] [UU]
The entire system has hung 3 times in the past 2 weeks, requiring a hard reset. For the time being, I'm going to assume the system hang is unrelated to my md issue, although I can't completely discount that possibility. Each time we've rebooted, md7 has required a rebuild, but I can't figure out how to tell from the logs which disk triggered the rebuild. I thought iostat might be able to help me while the RAID was still rebuilding:
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 43.39 1038.34 558.83 223108 120075
sdb 66.88 1445.47 648.86 310588 139420
sdc 36.42 12.99 22256.81 2792 4782320
sdd 190.75 23227.78 331.14 4990954 71152
md0 2.11 21.39 0.23 4596 50
md1 173.72 1855.87 522.14 398770 112192
md2 11.68 65.84 27.59 14146 5928
md6 27.42 149.83 69.51 32194 14936
sde 75.83 70.81 22326.91 15214 4797384
sdf 79.31 99.41 22326.91 21360 4797384
sr0 0.04 2.61 0.00 560 0
md7 202.31 1287.41 331.07 276626 71136
...but it looks to me like md7 is using sdd to rebuild all the other disks in that RAID. I thought maybe this was simply because sdd is an SSD and all the other disks are marked write-mostly, but in that case, it should only rebuild the one disk that was out of sync (unless all the spinning disks just happened to be out of sync, which seems unlikely to me).
Another theory I have: the spinning disks are always out of sync after one of these crashes simply because the SSD's writes are so fast that it finishes writing a block while the others are still writing it, and the system happens to lock up before the slower disks complete that write. Is that plausible?
So, how do I tell which disk(s) triggered the resync? Is the fact that I have an n-way mirror with mixed SSD and spinning disks possibly responsible for the fact that all the spinning disks are always rebuilt after one of these freezes, or does the md driver guarantee that a block isn't considered written on one disk until it's successfully written on all disks?
I understand that (at least Linux) RAID works something like a filesystem for these purposes - if the system crashes while it's in use, it will need to be checked on reboot. So the cause of your system's crashes may not be any of the disks in the array.
As Michael points out above, the hangs and the consequent unclean shutdowns are the reason you are seeing your RAID rebuild. The kernel md driver resyncs unclean arrays to ensure they are truly in sync, since a hang, crash, or power loss gives no guarantee as to which writes actually got flushed out to disk.
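As for the "which disk triggered the resync?" part of the question, one approach is to compare the per-member "Events" counters that mdadm stores in each superblock: a member whose counter lags the others was stale at assembly time. This is a hedged sketch; the extraction is demonstrated on a sample line so it runs anywhere, and the commented commands (using the md7 member devices from the question) are what you would actually run, as root, on the live system:

```shell
# On the live system (as root), print each md7 member's event counter:
#   mdadm --examine /dev/sdc1 | awk '/Events/ {print $NF}'
# and likewise for /dev/sdd1, /dev/sde5 and /dev/sdf5, then compare --
# a lagging counter marks the stale member.
# Demonstrated here on a captured sample line so the snippet is runnable:
sample='         Events : 12345'
printf '%s\n' "$sample" | awk '/Events/ {print $NF}'
```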
Now, as to why sdd is getting used: the first thing to understand is that after an unclean shutdown, it is the array as a whole, not an individual member device, that gets marked dirty. The md(4) manpage linked above explains that an unclean RAID-1 is resynced by copying the contents of the first drive onto all the others. In your example, the md7 array has partitions on drives sdc, sdd, sde and sdf, but if you look at your mdstat output:

md7 : active raid1 sdd1[0] sde5[3](W) sdf5[4](W) sdc1[1](W)

note how the first member, marked with a [0], is sdd1, a partition on sdd. That's the reason sdd is being used as the read source -- it's the first drive in md7.
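The slot numbers in /proc/mdstat can be picked out mechanically. Here is a small sketch, using the mdstat line quoted in the question as sample input, that extracts the member holding slot [0], i.e. the device md reads from during a RAID-1 resync:

```shell
# Sample mdstat line taken from the question; on a live system you would
# instead grep the md7 line out of /proc/mdstat.
mdstat_line='md7 : active raid1 sdd1[0] sde5[3](W) sdf5[4](W) sdc1[1](W)'

# Match the token ending in "[0]" and strip the slot suffix.
slot0=$(printf '%s\n' "$mdstat_line" | grep -o '[a-z0-9]*\[0\]' | sed 's/\[0\]//')
echo "$slot0"
```

For the array above this prints sdd1, confirming that sdd holds the first slot.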