I'm having a problem with the RAID array in a server running Ubuntu 10.04.
I've got a raid5 array of four disks - sd[cdef] - created like this:
# partition disks (repeated for sdd, sde and sdf)
parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary ext2 1 2000GB
parted /dev/sdc set 1 raid on
# create array
mdadm --create -v --level=raid5 --raid-devices=4 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
This has been running fine for a couple of months.
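(For what it's worth, I don't think I ever added md2 to /etc/mdadm/mdadm.conf by hand. If it matters, I'd guess the entry for it should look roughly like the line below - reconstructed from the --detail --scan output further down, not pasted from the actual file.)
# guessed mdadm.conf entry for md2; UUID taken from mdadm --detail --scan below
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=00.90 UUID=1bb282b6:fe549071:3bf6c10c:6278edbc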
I just applied system updates and rebooted, and the raid5 array - /dev/md2 - didn't come back on boot. When I re-assembled it with mdadm --assemble --scan, it came up with only 3 of the 4 member drives - sdf1 is missing. Here's what I can find:
(Side-note: md0 & md1 are raid-1 built on a couple of drives, for / and swap respectively.)
root@dwight:~# mdadm --query --detail /dev/md2
/dev/md2:
Version : 00.90
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Fri Apr 8 22:10:38 2011
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Events : 0.140
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
2 8 65 2 active sync /dev/sde1
3 0 0 3 removed
(Yes, the server's called Dwight; I'm a The Office fan :) )
So it thinks one drive (partition really) is missing, /dev/sdf1.
root@dwight:~# mdadm --detail --scan
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=c7dbadaa:7762dbf7:beb6b904:6d3aed07
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=00.90 UUID=1784e912:d84242db:3bf6c10c:6278edbc
mdadm: md device /dev/md/d2 does not appear to be active.
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=00.90 UUID=1bb282b6:fe549071:3bf6c10c:6278edbc
What, what, /dev/md/d2? What's /dev/md/d2? I didn't create that.
root@dwight:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid5 sdc1[0] sde1[2] sdd1[1]
5860540224 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md_d2 : inactive sdf1[3](S)
1953513408 blocks
md1 : active raid1 sdb2[1] sda2[0]
18657728 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
469725120 blocks [2/2] [UU]
unused devices: <none>
Ditto. md_d2? sd[cde]1 are in md2 properly, but sdf1 is missing (and mdadm seems to think it should be in an array of its own?)
root@dwight:~# mdadm -v --examine /dev/sdf1
/dev/sdf1:
Magic : a92b4efc
Version : 00.90.00
UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Update Time : Fri Apr 8 21:40:42 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 71136469 - correct
Events : 114
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 49 1 active sync /dev/sdd1
2 2 8 65 2 active sync /dev/sde1
3 3 8 81 3 active sync /dev/sdf1
...so sdf1 thinks it's part of the md2 device, is that right?
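One thing I notice is that the event counters don't match: sdf1's superblock above shows Events : 114, while the assembled array reports 0.140 (and sdc1 below shows 144). A quick way to compare that across all four members would be something like this (a sketch - I haven't pasted its output here):
# list each member's superblock header and event count side by side
mdadm --examine /dev/sd[cdef]1 | grep -E '^/dev|Events'
If sdf1 is just a little behind on events rather than actually failed, I'm hoping it can simply be re-added and resynced.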
When I run that on /dev/sdc1, I get:
root@dwight:~# mdadm -v --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 2
Update Time : Fri Apr 8 22:50:03 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 71137458 - correct
Events : 144
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 33 0 active sync /dev/sdc1
0 0 8 33 0 active sync /dev/sdc1
1 1 8 49 1 active sync /dev/sdd1
2 2 8 65 2 active sync /dev/sde1
3 3 0 0 3 faulty removed
And when I try to add sdf1 back into the /dev/md2 array, I get a busy error:
root@dwight:~# mdadm --add /dev/md2 /dev/sdf1
mdadm: Cannot open /dev/sdf1: Device or resource busy
Help! How can I add sdf1 back into the md2 array?
Thanks,
- Ben
Answer: stop the stray, inactive array first:
mdadm -S /dev/md_d2
then try adding sdf1 again.
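For completeness, the full sequence that suggestion implies would be roughly this (same device names as in the question; a sketch rather than a tested recipe, so double-check against your own /proc/mdstat first):
# stop the stray, inactive array that is holding sdf1 as a spare
mdadm --stop /dev/md_d2
# add sdf1 back into the degraded raid5; it should start resyncing
mdadm --add /dev/md2 /dev/sdf1
# watch the rebuild progress
cat /proc/mdstat
It may also be worth checking that /etc/mdadm/mdadm.conf has an up-to-date ARRAY line for md2 (compare it with mdadm --detail --scan) and rebuilding the initramfs with update-initramfs -u, since a stale or missing entry is one plausible reason the array came up as md_d2 after the reboot - but that part is a guess.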