I have a server (Dell PowerEdge 2950) with a PERC 6/i and six disks. Two of the disks are in a RAID-1 with a third as a hot spare. I have been asked to make a copy of the RAID-1 data on one of the other disks in such a way that we can store the disk offline and, if needed, boot the system from the offline disk. I also need to be able to periodically update the offline disk while the system is running. (This disk would be part of our disaster recovery and would also serve as a "last known good" disk in case of something going drastically wrong with the server.)
My instinct is to partition one of the extra disks to match the RAID-1's partitioning, then mount the partitions, rsync over the data, and put GRUB on the disk, but I cannot figure out how to configure the PERC 6/i's RAID to get this to work. If I create and then remove a RAID with the to-be-offlined disk, all of the data is deleted when the RAID is removed. If I create a RAID and just remove the disk, the controller becomes very sad. Since we're using RAID for the system disks, it doesn't look like there's a way to access one of the other disks without making it part of a RAID. Is there a way to do what I want?
I'm running Linux (RHEL 5) and using Dell's OMSA CLI programs (omreport and omconfig).
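For reference, here is roughly what I would run if I could get at the extra disk as a plain block device (device names are hypothetical; exposing the disk at all is exactly the part I can't figure out):

    # Copy the RAID-1's partition layout to the extra disk
    # (/dev/sda = the RAID-1 virtual disk, /dev/sdX = the extra disk -- hypothetical)
    sfdisk -d /dev/sda | sfdisk /dev/sdX

    # Mount the copy target and sync the live system over
    mkdir -p /mnt/offline
    mount /dev/sdX1 /mnt/offline
    rsync -aHx --delete / /mnt/offline/

    # Make the offline disk bootable on its own
    grub-install --root-directory=/mnt/offline /dev/sdX
    umount /mnt/offline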
I felt that my question was mostly about how I could get the PERC 6/i to do what I want. In the absence of any pointers on that, and since my reading of the documentation doesn't show a way forward, I'm going to bypass the RAID controller and put my bootable system copy on a USB disk. This will supplement our RAID and tape backups and serve as a quick response to a few failure scenarios that would otherwise require downtime while restoring the system from tape.
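For anyone who finds this later, the refresh procedure will look roughly like this (a sketch; device names and mount points are examples, assuming the USB disk shows up as /dev/sdg):

    # One-time setup: partition the USB disk, then create a filesystem
    mkfs.ext3 /dev/sdg1
    mkdir -p /mnt/usbcopy

    # Periodic refresh while the system is running
    mount /dev/sdg1 /mnt/usbcopy
    rsync -aHx --delete / /mnt/usbcopy/   # -x stays on the root fs, so /proc, /sys etc. come over empty
    # (if /boot is a separate partition, sync it too)
    grub-install --root-directory=/mnt/usbcopy /dev/sdg
    umount /mnt/usbcopy

The copy's /etc/fstab and grub.conf still need to be pointed at the USB disk itself before it will boot standalone.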
Repeat after me: "RAID is NOT a backup system."
What you are trying to do is NOT the way RAID is intended to be used.
Can you do it? Sure:
You can mirror the RAID-1 (RAID 1+1, a mirror of the mirror) and remove half of the top-level mirror set. As you've surmised, the controller will be "very sad" when you do this, since the RAID is now degraded (half the mirror is gone) and it's going to have to rebuild when it gets new disks.
All those rebuilds increase the chance of an unrecoverable error on the bottom-level mirror, which can eventually leave you in a situation where a rebuild knocks your server offline and loses data.
Also note that if you lose your primary hardware, there's no guarantee you can recover those disks in another machine: if you don't have a controller with the same firmware revision, you may have a set of disks that are only useful after spending a few grand on a data recovery company to get your data back off of them.
Similar challenges & risks exist if you're using mdraid or other software RAID tools.
Bottom line: This is a BAD idea. Don't do it.
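For illustration, the software-RAID version of that break-off-a-mirror cycle would look something like this with mdadm (hypothetical device names; note that every cycle ends in a full resync, which is exactly where the unrecoverable-error window comes from):

    # Grow the mirror to three members and let the extra disk sync up
    mdadm /dev/md0 --add /dev/sdc1
    mdadm --grow /dev/md0 --raid-devices=3
    # ... wait for the resync to finish (watch /proc/mdstat) ...

    # Break the third member off to take it offline
    mdadm /dev/md0 --fail /dev/sdc1
    mdadm /dev/md0 --remove /dev/sdc1
    mdadm --grow /dev/md0 --raid-devices=2

Every repetition means another full resync, and another chance for a latent read error on the remaining members to bite you.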
What should you do instead?
Spend the time to do a proper analysis, then deploy and test a backup system and a proper disaster recovery plan instead. There are plenty of excellent backup tools for Linux that are specifically designed for this kind of work (Bacula is a popular choice, and there's even a whole section of the Bacula manual dedicated to bare-metal we-lost-everything-but-the-backup-tapes restores).
As voretaq7 correctly points out, "something going drastically wrong with the server" includes the loss of the controller. So if you don't go the "official" disaster recovery route he suggests, it would IMHO make sense to copy the contents of the hardware RAID to a software RAID, so that you can easily add a second disk for mirroring while waiting for a replacement for the hardware. This means the target disk must be a few sectors bigger than the source disk, since the md metadata takes up some space. The boot loader configuration may not be directly copyable, depending on your partitioning, but you can reinstall GRUB on the backup disk before you need it. In that case you should ensure that both your controller's driver module and mdraid are part of your initrd. Just try booting from the backup disk after you're done.
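A minimal sketch of that approach, assuming the backup disk appears as /dev/sdb and that megaraid_sas is your controller's driver (substitute whatever your hardware actually uses):

    # Create a degraded RAID-1 with only the backup disk; a mirror partner
    # can be added later on the replacement hardware. 0.90 metadata (the
    # RHEL 5 default) keeps the superblock at the end so GRUB can read the fs.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=0.90 /dev/sdb1 missing
    mkfs.ext3 /dev/md0
    mount /dev/md0 /mnt/backup

    # Copy the running system over
    rsync -aHx --delete / /mnt/backup/

    # Make the backup disk bootable on its own
    grub-install --root-directory=/mnt/backup /dev/sdb

    # RHEL 5: rebuild the initrd so it contains both the controller
    # driver and the software RAID module
    mkinitrd --with=raid1 --with=megaraid_sas \
        /mnt/backup/boot/initrd-$(uname -r).img $(uname -r)

Remember to point the copy's /etc/fstab and grub.conf at /dev/md0 before relying on it.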