Question

T.J. Crowder

Asked: 2013-11-13 09:23:22 +0800 CST

How to get rid of a stubborn 'removed' device in mdadm

One of my server's drives failed and so I removed the failed drive from all three relevant arrays, had the drive swapped out, and then added the new drive to the arrays. Two of the arrays worked perfectly. The third added the drive back as a spare, and there's an odd "removed" entry in the mdadm details.

I tried both

mdadm /dev/md2 --remove failed

and

mdadm /dev/md2 --remove detached

as suggested here and here. Neither command complained, but neither had any effect.

Does anyone know how I can get rid of that entry and get the drive added back properly? (Ideally without resyncing a third time; I've already had to do it twice and it takes hours. But if that's what it takes, that's what it takes.) The new drive is /dev/sda, and the relevant partition is /dev/sda3.

Here's the detail on the array:

# mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Wed Oct 26 12:27:49 2011
     Raid Level : raid1
     Array Size : 729952192 (696.14 GiB 747.47 GB)
  Used Dev Size : 729952192 (696.14 GiB 747.47 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Nov 12 17:48:53 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

           UUID : 2fdbf68c:d572d905:776c2c25:004bd7b2 (local to host blah)
         Events : 0.34665

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       19        1      active sync   /dev/sdb3

       2       8        3        -      spare   /dev/sda3
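The two symptoms in that table (a slot marked "removed" and a member stuck as "spare") can be picked out mechanically. The following is a minimal sketch, assuming the device table layout shown above; `summarize_md_detail` is a hypothetical helper name, not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --detail` output on stdin and reports
# RAID slots in the "removed" state and members listed as spares.
summarize_md_detail() {
    awk '
        # The device table starts after this header row.
        $1 == "Number" && $4 == "RaidDevice" { in_table = 1; next }
        # Column 5 is the first word of the State column.
        in_table && $5 == "removed" { print "removed slot " $4 }
        # The device path is the last field on a spare row.
        in_table && $5 == "spare"   { print "spare " $NF }
    '
}
```

Usage would be along the lines of `mdadm --detail /dev/md2 | summarize_md_detail`.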

If it's relevant, it's a 64-bit server. It normally runs Ubuntu, but right now I'm in the data centre's "rescue" OS, which is Debian 7 (wheezy). The "removed" entry was there the last time I was in Ubuntu (it currently won't boot from the disk), so I don't think it's some Ubuntu/Debian conflict (and they are, of course, closely related).

Update:

Having done extensive tests with test devices on a local machine, I'm just plain getting anomalous behavior from mdadm with this array. For instance, with /dev/sda3 removed from the array again, I did this:

mdadm /dev/md2 --grow --force --raid-devices=1

And that got rid of the "removed" device, leaving me just with /dev/sdb3. Then I nuked /dev/sda3 (wrote a file system to it, so it didn't have the raid fs anymore), then:

mdadm /dev/md2 --grow --raid-devices=2

...which gave me an array with /dev/sdb3 in slot 0 and "removed" in slot 1 as you'd expect. Then

mdadm /dev/md2 --add /dev/sda3

...added it — as a spare again. (Another 3.5 hours down the drain.)

So with the rebuilt spare in the array, given that mdadm's man page says

RAID-DEVICES CHANGES

...

When the number of devices is increased, any hot spares that are present will be activated immediately.

...I grew the array to three devices, to try to activate the "spare":

mdadm /dev/md2 --grow --raid-devices=3

What did I get? Two "removed" devices, and the spare. And yet when I do this with a test array, I don't get this behavior.
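For anyone wanting to repeat the comparison, a throwaway test array along these lines could be built on loop devices. This is only a sketch under assumptions: the names /dev/md9, /dev/loop8 and /dev/loop9 and the image paths are hypothetical and assumed to be unused, and the script only prints the commands unless DRY_RUN=0 is set (running them for real needs root).

```shell
#!/bin/sh
# Dry-run sketch: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical helper; all device names passed in are assumptions.
test_grow_cycle() {
    md=$1 loop_a=$2 loop_b=$3
    # Two small backing files attached to loop devices
    run truncate -s 64M /tmp/md-test-a.img /tmp/md-test-b.img
    run losetup "$loop_a" /tmp/md-test-a.img
    run losetup "$loop_b" /tmp/md-test-b.img
    # 0.90 metadata to match the array in the question
    run mdadm --create "$md" --metadata=0.90 --level=1 --raid-devices=2 "$loop_a" "$loop_b"
    # Drop one member, shrink to one device, grow back, re-add
    run mdadm "$md" --fail "$loop_a" --remove "$loop_a"
    run mdadm "$md" --grow --force --raid-devices=1
    run mdadm --zero-superblock "$loop_a"
    run mdadm "$md" --grow --raid-devices=2
    run mdadm "$md" --add "$loop_a"
    # Inspect the result: is the re-added member active or a spare?
    run mdadm --detail "$md"
}

test_grow_cycle /dev/md9 /dev/loop8 /dev/loop9
```

The sequence mirrors the shrink, grow and add commands quoted above, so the final `--detail` output can be compared with what the real array reports.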

So I nuked /dev/sda3 again, used it to create a brand-new array, and am copying the data from the old array to the new one:

rsync -r -t -v --exclude 'lost+found' --progress /mnt/oldarray/* /mnt/newarray

This will, of course, take hours. Hopefully when I'm done, I can stop the old array entirely, nuke /dev/sdb3, and add it to the new array. Hopefully, it won't get added as a spare!

1 Answer

T.J. Crowder · Answer 1 · 2013-11-15T14:12:44+08:00

Best Answer

Well, since all of the usual options (listed in my question) failed, I had no choice but to:

1. Remove /dev/sda3 from the array
2. Nuke it
3. Create a new degraded array containing it and an empty slot
4. rsync the files from the old array to the new one
5. Stop the old array
6. Nuke /dev/sdb3
7. Add /dev/sdb3 to the new array
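
The steps above can be sketched as shell commands. This is a rough outline under stated assumptions, not the exact commands that were run: /dev/md9 as the new array name, ext4 as the file system, and the use of `--zero-superblock` for "nuke" are all assumptions (the question mentions writing a file system over the partition instead). The script prints the commands by default; with DRY_RUN=0 it would destroy data.

```shell
#!/bin/sh
# Dry-run sketch: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical helper: old array, new array, first and second partition.
migrate_array() {
    old_md=$1 new_md=$2 first_part=$3 second_part=$4
    # Steps 1-2: take the spare out of the old array and wipe its RAID metadata
    run mdadm "$old_md" --remove "$first_part"
    run mdadm --zero-superblock "$first_part"
    # Step 3: new degraded mirror; "missing" leaves the second slot empty
    run mdadm --create "$new_md" --level=1 --raid-devices=2 "$first_part" missing
    # Assumption: ext4 and these mount points (mount points as in the question)
    run mkfs.ext4 "$new_md"
    run mount "$new_md" /mnt/newarray
    # Step 4: copy the files (flags as in the question; -a would also
    # preserve ownership and permissions)
    run rsync -r -t -v --exclude lost+found --progress /mnt/oldarray/ /mnt/newarray
    # Steps 5-7: retire the old array and move its member to the new one
    run umount /mnt/oldarray
    run mdadm --stop "$old_md"
    run mdadm --zero-superblock "$second_part"
    run mdadm "$new_md" --add "$second_part"
}

migrate_array /dev/md2 /dev/md9 /dev/sda3 /dev/sdb3
```

Creating the new array with `missing` is what makes it start degraded, so the second partition can join it later as a regular member.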

It started off saying "spare, rebuilding" but once it was rebuilt, it got added to the array as an active drive.
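
While a rebuild like that is running, its progress can be read from /proc/mdstat. A minimal sketch, assuming the usual progress-line layout (`recovery = NN.N% ...`); `rebuild_progress` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/mdstat-style text on stdin and prints
# "<array> <percent>" for every array with a recovery or resync in progress.
rebuild_progress() {
    awk '
        # Array lines start with the md device name.
        /^md/ { name = $1 }
        # Progress lines look like: [=>...]  recovery = 12.6% (...) finish=...
        /recovery|resync/ {
            for (i = 1; i < NF; i++)
                if ($i == "=") { print name, $(i + 1); break }
        }
    '
}
```

Usage would be `rebuild_progress < /proc/mdstat`. Lines such as `resync=DELAYED` have no standalone `=` field and are skipped.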

Of course, this meant dealing with the knock-on effects of the array having changed (and as this was the root file system, those were a royal pain).

As far as I can tell, something had got corrupted in the definition of the previous array, because:

A) Adding the drive should have Just Worked™ like it did with the other two,

and

B) If not, shrinking and growing the array should have worked.
