On an mdadm software RAID 6 array (about 12 disks, 60 TB), random miswrites appeared in large files (>100 GB). The whole system was checked: RAM, NIC, the LSI RAID card. The prime suspect is the LSI card, because its BBU battery is flat and write-back/write-through caching is not set correctly. In theory, each chunk of data is backed by parity redundancy (one syndrome for RAID 5, two for RAID 6), but that only comes into play when one of the active disks fails.
Does mdadm have a command that starts a complete data consistency check using this parity redundancy? In other words, can I identify the miswritten chunks?
After discarding the faulty BBU, I want to know which files are good and which are corrupted and have to be replaced. If there is no way to find out, I will have to create the array from scratch and restore all the files from backup.
Start a check on the array (here `md125`; replace it with your actual array name). The check reads all the drives, computes the parity stripes and verifies that they are correct. For RAID 6 it will also correct single-device mismatches (where just one drive has gone out of sync) using the remaining drives: the dual parity enables detection of double errors and correction of single errors, including those caused by the disks' bit error rate. This is important for today's very large disks.
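The check is triggered through the md sysfs interface; a minimal sketch, assuming the array really is `md125`:

```shell
# Kick off a full scrub: md reads every stripe on every member disk
# and verifies the parity syndromes against the data (root required).
echo check > /sys/block/md125/md/sync_action
```

Writing `repair` instead of `check` tells md to rewrite mismatched stripes rather than only count them.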
It will report any important messages to the kernel log, readable via `dmesg`. You can monitor the status through the `/proc/mdstat` file or with `mdadm --detail /dev/md125`.
It is very useful to run the check periodically: it will not only correct miswrites, but also detect dying devices early and kick them out of the array, so set this check up to be invoked by the system scheduler (cron or systemd timers). Some Linux distributions (e.g. Debian) do this by default.
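To illustrate the scheduling, a hypothetical cron entry is sketched below (Debian's packaged equivalent is a cron job that calls its `checkarray` script); the array name `md125` is an assumption:

```shell
# /etc/cron.d/raid-scrub (hypothetical file):
# start a scrub every Sunday at 01:00, running as root since
# sync_action is writable only by root.
0 1 * * 0 root echo check > /sys/block/md125/md/sync_action
```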
While the first parity syndrome really is a simple XOR, the second one is not: it is calculated using more sophisticated mathematics over a Galois field. Linux software RAID uses a field that allows RAID 6 with no more than 257 active devices (not counting hot spares). The calculation is quite CPU-intensive, so it is better to run the check when your system is not under much load. You can also limit the check's impact by writing a cap into `/sys/block/md125/md/sync_speed_max` (the value is in KiB/s; the default is `200000`, i.e. about 200 MB/s). At boot, Linux also benchmarks the available algorithms for the RAID 6 syndrome calculation and reports the optimal one for your system, so you can see which one it will use and how fast it performs by reading the boot logs.
You can also interrupt a running check by writing `idle` to the array's `sync_action` file in sysfs.
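A sketch of both knobs, again assuming the array is `md125`:

```shell
# Throttle the running check to ~50 MB/s (the value is in KiB/s).
echo 50000 > /sys/block/md125/md/sync_speed_max

# Abort the check entirely.
echo idle > /sys/block/md125/md/sync_action
```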