We look after a Fujitsu RX300 S4 server that has 6 x 500GB SATA drives in a RAID-6 array, running from an LSI MegaRAID card (built into the motherboard).
A couple of weeks ago, one hard drive flagged itself as being faulty (orange light on the drive bay, MegaRAIDcli software shows a firmware status of "Failed"). We ordered and replaced the drive, but after the rebuild started, a different drive flagged itself as faulty.
This has now happened three times: twice a different drive was flagged as faulty, and once the flagged drive was one we had already replaced.
At the moment, two drives are showing faults. We don't know whether the drives are actually failing, or whether the backplane or RAID card is at fault.
Has anyone experienced this before? Any tips on what to do next? We have a call in to Fujitsu, but wondered if anyone out there had any pointers...
I feel for you. This kind of hardware problem is extremely stressful and annoying to debug.
Back in 2002 I had the "joy" of debugging a similar problem. After way too much "Let's replace a HD" and similar server massaging, the backplane turned out to be the actual fault. But that was an IBM server and a completely different story, anyway.
If possible, test the "faulty" drives in another server and see if they function normally there. My gut tells me that in your case it's not the drives and that something else is broken. Drives tend not to break like that.
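If you keep a note of which bay and which drive serial number was flagged each time, a few lines of Python can show whether the faults follow a drive or stay with a bay. This is only a rough sketch of that reasoning: the function name, the (bay, serial) record format and the sample data are all made up for illustration, and nothing here reads from the RAID controller.

```python
from collections import defaultdict

def classify_failures(events):
    """events: list of (bay, serial) tuples, one per reported failure.

    Hypothetical helper: the record format is an assumption, you would
    fill it in by hand from your own notes or controller logs.
    """
    bays_by_serial = defaultdict(set)
    serials_by_bay = defaultdict(set)
    for bay, serial in events:
        bays_by_serial[serial].add(bay)
        serials_by_bay[bay].add(serial)

    # A drive that was flagged in more than one bay points at the drive itself.
    suspect_drives = sorted(s for s, bays in bays_by_serial.items() if len(bays) > 1)
    # A bay that flagged more than one drive points at that bay, its cable or port.
    suspect_bays = sorted(b for b, serials in serials_by_bay.items() if len(serials) > 1)
    # Faults scattered over several bays with no repeats hint at something shared
    # (controller, power, backplane), or simply bad luck.
    shared = not suspect_drives and not suspect_bays and len(serials_by_bay) > 1

    return {
        "suspect_drives": suspect_drives,
        "suspect_bays": suspect_bays,
        "shared_component_suspected": shared,
    }

if __name__ == "__main__":
    # Made-up history: bay 0 flagged two different drives, bay 3 flagged one.
    history = [(0, "SN-A"), (3, "SN-B"), (0, "SN-C")]
    print(classify_failures(history))
```

With only three or four events this proves nothing by itself, but it makes it obvious which swap to try next, for example moving a "failed" drive into a bay that has never reported a fault.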
This could be a faulty controller. It could be unreliable power. It could be bad SATA cables. It could just be extremely bad luck.