tl;dr: I fat-fingered grub-install, then "correctly" re-issued it targeting the /boot filesystem on /dev/sda1, but it isn't reading grub/grub.conf unless I expressly tell it where to look using the grub prompt tools. How do I fix that?
I have a critical CentOS 5 system with multiple hard drives that aren't hot-swappable. That's a bad idea, by the way.
The first drive contains /boot, then two mirror mdraid partitions for the OS and data. The second drive contains just the two mdraid partitions.
The first drive is very slowly dying, so I added a third drive to prepare for the inevitable. I copied the partition layout of the first drive, added it to the mdraid mirror, then used dd to clone sda1 to sdc1.
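For readers following along, the preparation steps described above might look roughly like the sketch below. It is a dry run that only prints the commands. The array names (/dev/md0, /dev/md1) and the partition numbers are assumptions for illustration; the post does not give them.

```shell
#!/bin/sh
# Dry run: print each command instead of executing it.
# SRC, DST, the md array names and the partition numbers are hypothetical.
SRC=/dev/sda
DST=/dev/sdc

run() { echo "+ $*"; }

# 1. Copy the partition table from the old drive to the new one.
run "sfdisk -d $SRC | sfdisk $DST"

# 2. Add the new drive's partitions to the existing mirrors.
run "mdadm /dev/md0 --add ${DST}2"
run "mdadm /dev/md1 --add ${DST}3"

# 3. Clone the non-RAID /boot partition.
run "dd if=${SRC}1 of=${DST}1 bs=1M"
```

Note that none of these steps writes boot code to the MBR of the new drive, which is why sdc was not bootable afterwards.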
I had a hardware maintenance window last night and needed to reboot the machine anyway, so I figured I'd take the chance to switch sdc to the boot drive. As I only copied the partition layout and the first partition, not the entire drive, I figured that sdc wasn't bootable. So after adjusting fstab, I made sdc1 bootable and used grub-install to ensure that grub could take care of things.
Only I fat-fingered the command and typed grub-install /dev/sda.
It gave me a warning about not finding the drive in the BIOS drive list, so I assumed that it didn't do anything harmful. I re-issued the command targeting /dev/sda1 instead, but got the same error. Hmm. Oh well, it probably didn't do anything, right? Yeah. No.
When the system didn't come back up after reboot (printing GRUB GRUB GRUB over and over on the console), I knew I was screwed. Apparently what I did is irritatingly common.
I booted the machine into a live CD, used dd to nuke the MBRs on both sda and sdc, mounted sda1's copy of /boot, issued the correct command (which involves asking it to probe the drive list and giving an actual filesystem location), and rebooted. What came up was the grub shell. I was able to issue root (hd0,0) and configfile grub/grub.conf to get into the boot menu, but I would have assumed that if I'd issued the command correctly to begin with then it would have seen the menu immediately.
So, my critical system is running fine. I'm only going to be able to reboot it once in the near future, so I'd like to get this taken care of correctly.
So, my questions:
- Is the current booting-into-grub-but-not-seeing-any-configuration fixable without re-running grub-install? I'm terrified of the thing now.
- If I have to invoke grub-install again, what should be the correct way? I used grub-install --recheck --root-directory=/path/to/sda1/boot /dev/sda1 to get it into its current state.
I have a similar configuration: I usually create /boot on a mirrored mdraid partition and then install grub on the MBR of every drive, so the server can boot if any one drive fails. Everything else (i.e. everything except the MBR stage) is replicated by mdraid anyway.
Just run
You need to install grub on the MBR, not on the first partition. It will pick up the stage 1.5 files and boot the kernel, which will then switch to the root filesystem on the mdraid partitions.
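As a hedged sketch of what installing to the MBR of each drive can look like with GRUB legacy (the version shipped with CentOS 5 and 6), the script below prints the grub shell commands for each disk. It assumes /boot is the first partition on every disk and that the disks are /dev/sda and /dev/sdc; adjust both to your layout. Mapping each disk to (hd0) in turn is a common trick so that whichever disk the BIOS boots first finds its own /boot.

```shell
#!/bin/sh
# Print the GRUB legacy shell commands that write stage1 to the MBR of one
# disk. Assumption: /boot lives on the first partition of that disk.
grub_mbr_commands() {
    disk="$1"
    printf 'device (hd0) %s\n' "$disk"   # treat this disk as the first BIOS drive
    printf 'root (hd0,0)\n'              # partition holding /boot/grub
    printf 'setup (hd0)\n'               # install to the MBR, not a partition
    printf 'quit\n'
}

# Dry run: show what would be fed to the grub shell for each disk.
# The disk names here are assumptions.
for disk in /dev/sda /dev/sdc; do
    echo "# commands for $disk"
    grub_mbr_commands "$disk"
done
```

To apply it for real, the output for one disk would be piped into the grub shell as root, for example `grub_mbr_commands /dev/sda | grub --batch`. Review the printed commands before doing that.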
Here is what my configuration looks like. There's nothing special to be done; it's CentOS 6, but the same applies:
device map
menu.lst