My older (FC11) install had reached EOL, so I reinstalled with FC14 on its RAID1 root filesystem.
I now suspect that, after the install, the filesystem is no longer fully RAIDed. The question is whether this suspicion is true, and if so, how to fix it.
[root@atlas dev]# cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 sda[0]
732571648 blocks super external:/md0/0 [2/1] [U_]
md0 : inactive sdb[1](S) sda[0](S)
4514 blocks super external:imsm
unused devices: <none>
[root@atlas dev]#
md127 seems to be a child of the md0 container, but it lists sda[0] as an explicit device and not sdb. Reading this (I take [2/1] [U_] to mean two devices expected, one up), I assume I'm running off sda and that sdb is defunct.
The trouble is that the filesystem has seen quite a bit of activity since, so the two discs can't be assumed to be in sync; sdb will probably have to be rebuilt. I do have a full backup, though, so I'm willing to take calculated risks.
Note that this filesystem is the root device, so the fix may have to happen from single-user mode.
Any explanation of how to read the mdstat output is also welcome. My guess is that I somehow need to add sdb from the md0 container to md127.
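In case it matters, these are the commands I'd use to double-check the state before touching anything (just a sketch; --detail and --examine are standard mdadm options, the device names are the ones above):
mdadm --detail /dev/md127    # member list and degraded/clean state of the mirror
mdadm --examine /dev/sda     # per-disk metadata; should show the imsm container info
mdadm --examine /dev/sdb     # compare against sda to see why it was dropped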
Kernel log excerpt:
dracut: Starting plymouth daemon
dracut: rd_NO_DM: removing DM RAID activation
pata_marvell 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pata_marvell 0000:03:00.0: setting latency timer to 64
scsi6 : pata_marvell
scsi7 : pata_marvell
ata7: PATA max UDMA/100 cmd 0xcc00 ctl 0xc880 bmdma 0xc400 irq 16
ata8: PATA max UDMA/133 cmd 0xc800 ctl 0xc480 bmdma 0xc408 irq 16
dracut: Autoassembling MD Raid
md: md0 stopped.
md: bind<sda>
md: bind<sdb>
dracut: mdadm: Container /dev/md0 has been assembled with 2 drives
md: md127 stopped.
md: bind<sda>
md: raid1 personality registered for level 1
md/raid1:md127: active with 1 out of 2 mirrors
md127: detected capacity change from 0 to 750153367552
md127: p1 p2
md: md127 switched to read-write mode.
Output of mdadm --detail --scan:
ARRAY /dev/md0 metadata=imsm UUID=e14582dd:1863c14a:fb0d98f0:5490080b
ARRAY /dev/md127 container=/dev/md0 member=0 UUID=c0cf0b37:bc944eec:ac93d30e:ee2f423e
/etc/mdadm.conf:
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 UUID=e14582dd:1863c14a:fb0d98f0:5490080b
ARRAY /dev/md127 UUID=c0cf0b37:bc944eec:ac93d30e:ee2f423e
Update:
After waiting for the better part of the day I bit the bullet, verified my backups, booted to single-user mode, and there I could simply run mdadm --manage /dev/md127 --add /dev/sdb.
Rebuilding took about 3 hours (of unpaid overtime). Everything seems to work and looks intact.
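For the record, the sequence was roughly this (from single-user mode; re-reading /proc/mdstat was just how I watched the progress):
mdadm --manage /dev/md127 --add /dev/sdb    # re-add the dropped disk to the mirror
cat /proc/mdstat                            # repeat to watch the recovery percentage climb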
I also remember meddling with fakeraid before deciding to go with software RAID, though AFAIK that was on other discs. Maybe md0 is a leftover from that: something that slipped in via a poorly restored /etc and was then beaten on until it worked. On the next reinstall it blew up in my face, though. The experimenting to get it working probably left some metadata reserved on disk.
The scary thing is that both arrays now contain the same discs, AND that md0 is briefly enabled during boot. I get warnings that seem to signal that md127 is a child of md0, which makes deletion a bit scary. But I'll dig up a boot disk and give it a whirl the next day I have time for system administration (after making an incremental on top of yesterday's full backup, of course).
md127 has two partitions (a big root plus swap), both mounted. md0 is not active (and I don't dare activate it, since it shares drives with md127), so I don't know what partitions it has.
Since md127 now works (2/2, [UU]), it is now a matter of figuring out whether md0 can be safely deleted (is md127 a child of md0?) and, if so, how, so as to avoid problems during future installs.
I'll probably need to wipe some of the metadata on disk too, so the next install doesn't pick it up.
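If I end up doing that, my plan (untested, from a rescue boot, and only after the backups check out) is along these lines, where sdX stands for whichever disk turns out to carry the stale metadata:
mdadm --stop /dev/md127             # stop the mirror first
mdadm --stop /dev/md0               # then the container
mdadm --examine /dev/sdX            # check what metadata actually lives on the disk
mdadm --zero-superblock /dev/sdX    # wipes the md superblock (destructive; only after confirming md127 doesn't need it)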