2012-03-31 Debian Wheezy daily build in VirtualBox 4.1.2, 6 disk devices.
My steps to reproduce so far:
- Set up one partition per disk, spanning the entire disk, as a physical volume for RAID
- Set up a single RAID6 mdraid array from all of those partitions
- Use the resulting md0 as the only physical volume for the volume group
- Set up your logical volumes, filesystems and mount points as you wish
- Install your system
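For anyone reproducing this outside the installer, the stack above can be sketched as a dry run that only prints the commands. This is a rough sketch, not what the Debian installer does internally: the helper name plan_stack, the volume group name vg0, the logical volume name root and the 10G size are all placeholders I made up for illustration.

```shell
# Print (do not run) the commands for a RAID6 -> LVM -> ext4 stack.
# Arguments: the member partitions, e.g. /dev/sda1 ... /dev/sdf1.
# plan_stack, vg0, root and 10G are hypothetical names/values.
plan_stack() {
    # mdadm needs at least 4 member devices for RAID6.
    [ "$#" -ge 4 ] || { echo "RAID6 needs at least 4 members" >&2; return 1; }
    echo "mdadm --create /dev/md0 --level=6 --raid-devices=$# $*"
    echo "pvcreate /dev/md0"              # md0 is the only physical volume
    echo "vgcreate vg0 /dev/md0"          # vg0 is a placeholder name
    echo "lvcreate -L 10G -n root vg0"    # size and LV name are placeholders
    echo "mkfs.ext4 /dev/vg0/root"
}
```

Calling plan_stack /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 prints five commands to review. Nothing is written to disk unless you copy them out and run them yourself.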
Both / and /boot will be in this stack. I've chosen EXT4 as my filesystem for this setup.
I can get as far as the GRUB2 rescue console. It can see the mdraid array, the volume group and the LVM logical volumes on it (all named appropriately at every level), but I cannot ls the filesystem contents of any of them and I cannot boot from them.
As far as I can see from the documentation, the version of GRUB2 shipped there should handle all of this gracefully.
http://packages.debian.org/wheezy/grub-pc (1.99-17 at the time of writing.)
According to the generated grub.cfg file, it loads the ext2, raid, raid6rec, dosmbr (this one is in the list of modules once per disk) and lvm modules. The generated grub.cfg also defines the list of modules to be loaded twice; from some quick Googling around, this seems to be the norm and OK for GRUB2.
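One way to double-check which modules the generated config loads, and how often, is to count the insmod lines. A minimal sketch, assuming the config lives at /boot/grub/grub.cfg; the function name grub_modules is my own and not part of GRUB.

```shell
# Print "module count" for every module loaded via insmod in a grub.cfg.
# $1 is the path to the config (default: /boot/grub/grub.cfg).
# grub_modules is a hypothetical helper name.
grub_modules() {
    awk '$1 == "insmod" { n[$2]++ }              # count each module name
         END { for (m in n) print m, n[m] }' "${1:-/boot/grub/grub.cfg}" \
        | LC_ALL=C sort                           # stable, locale-independent order
}
```

A module that shows up with a count of 2 or more is simply being loaded in more than one place in the file, for example once in the header and once per menu entry.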
How can I get GRUB2 to actually read the contents of the filesystems and boot the system?
Which of my assumptions about its functionality are wrong?
EDIT (2012-04-01) My generated grub.cfg:
It seems it first makes my /usr logical volume the root, and that might be the source of the failure? A grub-mkconfig bug? Or is it supposed to get access to stuff from /usr before / and /boot? /boot is on / for me; there is no separate boot logical volume.
In the end, it turned out to be a GRUB2 bug/issue with a degraded software RAID array.
GRUB2 1.9x has issues booting from a degraded array. Booting into the system in rescue mode and letting the RAID array recover itself fixed the issue for the original setup in question.
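To tell whether an array is degraded before rebooting, you can look at the member-status field in /proc/mdstat: a healthy six-disk array shows [UUUUUU], and an underscore marks a member that is missing or not yet in sync. A small sketch that automates the check; the function name degraded_arrays is made up for this example.

```shell
# Print the name of every md array whose status field (e.g. [UUUU_U])
# contains "_", i.e. at least one member is missing or not in sync.
# $1 is an mdstat-formatted file (default: /proc/mdstat).
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        /^md[^ ]* :/          { name = $1 }                  # remember current array
        /\[[U_]*_[U_]*\]/     { if (name != "") print name } # status with a gap
    ' "${1:-/proc/mdstat}"
}
```

If degraded_arrays prints nothing, every array reports all members up. If it prints md0, wait for the resync to finish (watch /proc/mdstat) before relying on an affected GRUB2 version to boot from it.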
Incidentally, the setup works (as of 2012-06-26) straight out of the box on Fedora 17, Arch (stable) and Gentoo (stable + latest grub2 bzr via Portage): GRUB2 2.0+ has fixed the issue. With the Wheezy freeze hitting soon, I'm really hoping the issue gets resolved either by jumping to 2.0 or by backporting the fix.
For me this still affects Debian 6 and 7, and Ubuntu 8.04, 10.04 and 12.04.
Letting the RAID array sync in single-user recovery mode is an acceptable workaround for a home system, but having a potential extra hitch when rebooting a production server (even a small office file server) makes one think twice.
Very good post, thanks a lot. This helped me out quite a bit when installing LVM over RAID on Debian Wheezy. Here are the steps I took to overcome the problem.
Update GRUB2 to version 2.0 or later
Add these lines to /etc/apt/sources.list
apt-get update
apt-get install grub2
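After the upgrade it is worth confirming that the installed package really is a 2.x build. A minimal sketch of such a check, assuming the version string has the usual Debian shape (for example 1.99-17 or 2.00-22); the function name grub_is_2x is hypothetical.

```shell
# Succeed if the given GRUB package version is 2.x or newer.
# Returns 1 for older versions and 2 if the string cannot be parsed.
# grub_is_2x is a hypothetical helper name.
grub_is_2x() {
    ver=${1#*:}          # drop a Debian epoch such as "1:" if present
    major=${ver%%.*}     # keep the text before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;   # not a plain number: cannot decide
    esac
    [ "$major" -ge 2 ]
}
```

On a Debian system the version can be read with dpkg-query, for example: grub_is_2x "$(dpkg-query -W -f='${Version}' grub-pc)" && echo "GRUB 2.x installed". Installing the package may not rewrite the boot code on every disk, so you will probably still want to run grub-install on each member disk and then update-grub.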
Perhaps you made the single partition too large and did not leave enough space for the GRUB2 installation, so it has overwritten parts of the LVM space. It's something of a long shot. Try your steps to reproduce the problem, except this time use a single disk (skip the RAID), create the single partition exactly as you did before, and then do the rest as before. If I am right, you should see the same behavior.
UPDATE: So, this answer is wrong. I was looking through the GRUB2 manual and found this section which states: