I'm using Windows Server 2008 software RAID volumes. Recently I've started to receive an error in the System event log: "The device, \Device\Harddisk7\DR7, has a bad block."
Meanwhile, the volume in Disk Manager is marked as "Failed Redundancy". I can run "Reactivate Volume" and it starts to re-sync, but after a while it stops and returns to the previous state.
Running chkdsk on the failed disk does not help. What can I do besides removing, reformatting and restoring from backup? Thank you.
Update: When I take one of the disks offline (#7, for instance), the "Failed Redundancy" label changes to "Failed", so I think removing the disk and replacing it with an empty one will not save the volume.
You need to replace the drive with the bad block. Each time you try to reconstruct the array, it will fail once it hits that block.
You should not need to fully reformat though - you should be able to remove the failed drive, replace it, partition it as needed and rebuild the array (unless, of course, the array is RAID0 rather than RAID1, RAID5 or similar).
I've not done any of the above with Windows' software RAID, so someone else will have to help you there if the place to issue the relevant commands is not obvious. To help them help you, it would be a good idea to add your current disk layout to your question.
If the volume you back up to is the one with the defective drive you need to make a backup to something else, at least temporarily. Regardless, you have to replace the bad drive. It really is that simple.
I also suggest you reconsider your backup strategy. A proper backup is something that will allow you to restore your data even if all your current systems are destroyed. What you have isn't a backup, it's merely a second copy.
I would add to the comments made by Evan and others regarding testing your backups. In addition to regular test restores, you need to know that you can restore that data to a system other than the one you backed up. If the crunch comes and you need to do a disaster recovery, you may have to do so using completely different hardware. Unless this can be done, the business is vulnerable.
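For what it's worth, one way to check a test restore is to compare file checksums between the original data and the restored copy. The following is only a rough sketch in Python, not tied to any particular backup product; the function names `tree_digests` and `compare_restore` are made up for illustration, and it assumes both trees are reachable as ordinary directories.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_restore(original, restored):
    """Return (missing, changed) relative paths found by comparing two trees.

    missing: files present in the original but absent from the restore.
    changed: files present in both whose contents differ.
    """
    before = tree_digests(original)
    after = tree_digests(restored)
    missing = sorted(set(before) - set(after))
    changed = sorted(
        name for name in before.keys() & after.keys()
        if before[name] != after[name]
    )
    return missing, changed
```

Calling `compare_restore(original_dir, restored_dir)` and getting two empty lists back means every original file exists in the restore with identical contents. It says nothing about permissions, open files or system state, so it complements a real restore test on different hardware rather than replacing it.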
I'd like to summarize my experience here.
When a Windows Server software RAID5 array (true as of 2008 R2) is in the "Resyncing" state, it is as vulnerable to disk failure as a striped volume. So you have two options:
Again: if one of the volume's drives fails during resyncing, you are pretty much screwed. All you can do is create the volume again, run chkdsk on it, and restore the data from backup.
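To illustrate why a second failure during resync is fatal, here is a small Python sketch of generic RAID5-style XOR parity. It is a toy model of the idea, not how Windows implements its software RAID, and the names `xor_blocks` and `rebuild_stripe` are invented for the example. A stripe can recover exactly one unreadable block from the others; with two unreadable blocks there is nothing left to compute from.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_stripe(stripe):
    """Return a copy of the stripe with its one missing block recomputed.

    stripe is a list of byte blocks (data blocks plus one parity block);
    an unreadable block is represented by None.
    Raises ValueError when more than one block is unreadable.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks unreadable")
    rebuilt = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        # XOR of all surviving blocks equals the missing block.
        rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt


# Three data blocks plus their parity block form one stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

# One disk lost: the block is recoverable from the rest.
one_lost = [stripe[0], None, stripe[2], stripe[3]]
recovered = rebuild_stripe(one_lost)

# A bad block on a second disk during the rebuild: not recoverable.
two_lost = [None, None, stripe[2], stripe[3]]
try:
    rebuild_stripe(two_lost)
    second_failure_survived = True
except ValueError:
    second_failure_survived = False
```

Here `recovered` matches the original stripe, while the second case raises, which is the situation a bad block on another member disk creates while the array is still resyncing.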