Two of my four servers currently have a mismatch_cnt of
about 40000, and that worries me. We are using a RAID10 setup. The manual states:
However on RAID1 and RAID10 it is possible for software issues to cause a mismatch to be reported. This does not necessarily mean that the data on the array is corrupted. It could simply be that the system does not care what is stored on that part of the array - it is unused space.
We do not use any swap files on our servers. One of the server's HDDs is failing its SMART self-check, and its Available_Reservd_Space is too low. The hosting provider says that it replaces HDDs only when they are physically faulty.
I do not think I understand the real meaning and usefulness of this parameter. What other reasons could there be for it to have such a large value? How can the system not care about what is stored on that part of the array if the array is mirrored? For security reasons I would expect the system to sync free space as well, and in that case, what is left?
Are there any reliable ways to estimate the risk of having a particular HDD in a server?
Often, two reasons are given for a high mismatch_cnt on a RAID1/10 array:
The above reasons are harmless: while they do point to differences in the array (basically, a de-synchronized array), they concern unused disk space.
However, there is a much more concerning and dangerous cause of a high mismatch_cnt: a hardware issue (e.g. a faulty power supply delivering inconsistent power and/or a misbehaving disk DRAM chip) can alter in-flight data, leading to many inconsistencies between the two disks.
You can find more information in this thread on the linux-raid mailing list.
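
To put a number like 40000 in perspective, here is a minimal Python sketch that reads the counter from sysfs and converts it to a size. It assumes the array is md0 (adjust the path for your system) and relies on the kernel md documentation's statement that mismatch_cnt is counted in 512-byte sectors. The helper names are my own, not part of any tool. Since md usually compares whole pages, the real number of differing bytes may be smaller than this upper bound.

```python
from pathlib import Path

SECTOR_BYTES = 512  # md reports mismatch_cnt in 512-byte sectors


def read_mismatch_cnt(md_dir):
    """Return the integer stored in <md_dir>/mismatch_cnt."""
    return int(Path(md_dir, "mismatch_cnt").read_text().strip())


def mismatch_bytes(sectors):
    """Upper bound, in bytes, of the data covered by the mismatched sectors."""
    return sectors * SECTOR_BYTES


if __name__ == "__main__":
    # Assumption: the array is md0. The value is only refreshed by a scrub,
    # e.g. after writing "check" to /sys/block/md0/md/sync_action as root.
    md_dir = "/sys/block/md0/md"
    try:
        count = read_mismatch_cnt(md_dir)
    except (OSError, ValueError) as exc:
        print(f"could not read mismatch_cnt from {md_dir}: {exc}")
    else:
        size_mib = mismatch_bytes(count) / (1024 * 1024)
        print(f"mismatch_cnt={count} sectors, at most {size_mib:.1f} MiB")
```

By this arithmetic, a mismatch_cnt of 40000 covers at most about 20 MB of the array. That says nothing about whether the affected area is free space or live data, so it does not rule out the hardware cause described above.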