On one of our file servers I keep getting delayed write failures to the OS disk array. It's built on a RAID 1 array using an on-board RAID card.
So far I've verified that both hard disks are OK by pulling them from the machine and running full disk checks (not at the same time, obviously). The write failures still happen with only one disk present, and a check disk will not complete while the disk is in the server.
How do I go about confirming that it definitely is the RAID card, or is what I've done enough to safely say it is the card? Bear in mind that a faulty card means a motherboard swap.
Thanks
The only thing you haven't done, as far as I can tell, is to swap in fresh disks on the suspected-bad card. Don't try this with data you care about, but cloning the real data to a couple of spare disks (on another machine, followed by a check to catch a bad spare) is probably reasonable. (You do have spare disks around, right?)
The reasoning: if the problem recurs with known-good replacement disks on the same card, the disks are not at fault, which leaves the card as the likely culprit.
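For the "clone, then check" step, one possible way to confirm a clone is byte-identical to its source is to hash both and compare. This is only a minimal sketch: the function names are made up for illustration, and it assumes you are comparing image files or devices of exactly the same size (a larger spare disk would hash differently even if the copied region is fine). Reading raw devices also typically needs administrator rights.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file or raw device, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a whole disk image never sits in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_matches(source_path, clone_path):
    """Return True when the source and the clone hash to the same value."""
    return sha256_of(source_path) == sha256_of(clone_path)
```

You would call something like `clone_matches("source.img", "spare.img")` (hypothetical paths) on the other machine before moving the spares into the suspect server. A mismatch there points at a bad spare or a bad copy, not at the RAID card.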
First, update the controller's drivers and firmware, and read the known issues and release notes. Depending on your OS and hardware, there are different things you can do.
Verify that the disks you are using are supported with the server in question.
Third-party disks sometimes cause odd issues, either because they behave slightly differently or because they don't support ioctls that the RAID card needs.