Ping a Specific Port

Question

Markus A.

Asked: 2013-06-22 10:12:40 +0800 CST2013-06-22 10:12:40 +0800 CST 2013-06-22 10:12:40 +0800 CST

Major issues with fsck of 10TB ext3 RAID 6 (memory allocation failed, etc.)

772

I recently added a 7th 2TB drive to a linux md software RAID 6 setup. After md finished reshaping the array from 6 to 7 drives (from 8 to 10TB), I was still able to mount the file system without problems. In preparation for resize2fs, I then unmounted the partition and ran fsck -Cfyv and was greeted with an endless stream of millions of random errors. Here is a short excerpt:

Pass 1: Checking inodes, blocks, and sizes
Inode 4193823 is too big.  Truncate? yes
Block #1 (748971705) causes symlink to be too big.  CLEARED.
Block #2 (1076864997) causes symlink to be too big.  CLEARED.
Block #3 (172764063) causes symlink to be too big.  CLEARED.
...
Inode 4271831 has a extra size (39949) which is invalid Fix? yes
Inode 4271831 is in use, but has dtime set.  Fix? yes
Inode 4271831 has imagic flag set.  Clear? yes
Inode 4271831 has a extra size (8723) which is invalid Fix? yes
Inode 4271831 has EXTENTS_FL flag set on filesystem without extents support. Clear? yes
...
Inode 4427371 has compression flag set on filesystem without compression support. Clear? yes
Inode 4427371 has a bad extended attribute block 1242363527.  Clear? yes
Inode 4427371 has INDEX_FL flag set but is not a directory. Clear HTree index? yes
Inode 4427371, i_size is 7582975773853056983, should be 0.  Fix? yes
...
Inode 4556567, i_blocks is 5120, should be 5184.  Fix? yes
Inode 4566900, i_blocks is 5160, should be 5200.  Fix? yes
...
Inode 5628285 has illegal block(s).  Clear? yes
Illegal block #0 (4216391480) in inode 5628285.  CLEARED.
Illegal block #1 (2738385218) in inode 5628285.  CLEARED.
Illegal block #2 (2576491528) in inode 5628285.  CLEARED.
...
Illegal indirect block (2281966716) in inode 5628285.  CLEARED.
Illegal double indirect block (2578476333) in inode 5628285.  CLEARED.
Illegal block #477119515 (3531691799) in inode 5628285.  CLEARED.

Compression? Extents? I've never had ext4 anywhere near this machine!

Now, the problem is that fsck keeps dying with the following error message:

Error storing directory block information (inode=5628285, block=0, num=316775570): Memory allocation failed

At first I was able to simply re-run fsck and it would die at a different inode, but now it's settled on 5628285 and I can't get it to go beyond that.

I've spent the last days trying to search for fixes to this and found the following 3 "solutions":

Use 64-bit linux. /proc/cpuinfo contains lm as one of the processor flags, getconf LONG_BIT returns 64 and uname -a has this to say: Linux <servername> 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux. Should be all good, no?
Add [scratch_files] / directory = /var/cache/e2fsck to /etc/e2fsck.conf. Did that and every time I re-run fsck, it adds another 500K *-dirinfo-* and an 8M *-icount-* file to the /var/cache/e2fsck directory. So that seems to have its desired effect as well.
Add more memory or swap space to the machine. 12GB of RAM and a 32GB swap partition should be sufficient, no?

Needless to say: Nothing helped, otherwise I wouldn't be writing here.

Naturally, now the drive is marked bad and I can't mount it any more. So, as of right now, I lost 8TB of data due to a disk-check?!?!?

This leaves me with 3 questions:

Is there anything I can do to fix this drive (remember, everything was fine before I ran fsck!) other than spending a month to learn the ext3 disk format and then trying to fix it manually with a hex editor???
How is it possible, that something as mission-critical as fsck for a file-system as popular as ext3 still has issues like this??? Especially since ext3 is over a decade old.
Is there an alternative to ext3 that doesn't have these sorts of fundamental reliability issues? Maybe jfs?

(I'm using e2fsck 1.42.5 on 64-bit Debian Wheezy 7.1 now, but had the same issues with an earlier version on 32-bit Debian Squeeze)

3 Answers

Voted

David Schwartz · Answer 1 · 2013-06-22T10:37:46+08:00

David Schwartz

2013-06-22T10:37:46+08:002013-06-22T10:37:46+08:00

Just rebuild the array and restore the data from a backup. The whole point of RAID is to minimize downtime. By messing around and trying to fix a problem like this, you just increase your downtime defeating the whole purpose of RAID. RAID doesn't protect against data loss, it protects against downtime.

3

Markus A. · Answer 2 · 2013-06-22T12:00:25+08:00

After playing around with fsck some more, I found some remedies:

Preventing the 'Memory allocation failed' error

fsck seems to have a major issue with memory leakage. If it is run on a file-system with some problems (real or imaginary), it will "fix" them one-by-one (see screen dump in original question). As it does so, it consumes more and more memory (maybe keeping a change-log?). Pretty much without bounds. But, fsck can be cancelled at any time (Ctrl-C) and restarted. In this case, it will continue where it left off, but it's memory use is reset to next-to-nothing (for a while).

With this in mind, the three things that need to be done are:

Use 64-bit Linux (it seems to make a difference in how fsck can use the available memory)
Add a ridiculously huge swap partition (I used 256GB, fsck runs for about 12 hours with it)
Frequently abort and restart fsck (how frequently depends on the size of the swap partition)

NOTE: I have no idea if canceling and restarting fsck brings with it any other dangers (probably does), but it seems to work for me.

Dealing with the resulting damage, if the 'Memory allocation failed' error occurs (IMPORTANT!)

fsck handles the Memory allocation failed error in the worst possible way: I destroys perfectly good data. I'm not sure why, but my guess is that it does some final data-write to disk of things that it had kept in memory, which (due to the error) have meanwhile gotten corrupted.

In my case, the most visible problem was that when I restarted fsck after the error, it sometimes reported a corrupted super-block. The problem is: I have no idea how corrupted the super-block was, especially in the cases where it didn't report it as corrupted. Maybe, if restarted after the error, it then uses incorrect drive meta-data found in the corrupted super-block to do all further checks and ends up fixing "issues" that aren't really there, destroying good data in the process.

Therefore, if fsck ever dies with the Memory allocation failed error, it needs to be restarted using the -b parameter to use a backup super-block that (hopefully) wasn't corrupted by the error. The location of the backup super-blocks can be found using mke2fs -n /dev/....

Since I don't know what happens if fsck dies with the backup super-block selected, I usually just abort fsdk immediately when it gets to Pass 1: Checking inodes, blocks, and sizes and restart it again without -b, at which point it starts without complaining about a bad super-block. I.e. it seems like the first thing fsck -b does is to restore the main super-block.

Now the one we've all been waiting for:

How to mount a file-system without letting fsck run to completion

This, I found by accident: It turns out that after running fsck -b and aborting it as soon as it prints the Pass 1: Checking inodes, blocks, and sizes (before any errors are found) the file-system is left in a mountable state (Yay! I got pretty much all of my data back!).

(Note: There may be another way using mount -o force, but it wasn't needed in my case.)

How to avoid all these issues in the first place

There seem to be two ways:

Use ext3, but keep a perfectly up-to-date backup. Then, frequently run fsck with parameter -N. If it shows any problems, delete the entire fs and restore everything from the backup. Since, in this scenario, one would be relying very heavily on the backup, I suggest keeping a backup of the backup. Also, use a copy-tool that somehow ensures that the restore does not create random errors in the process (An MTBF of a trillion r/w-ops is small when dealing with TB's of data). Make sure to plan for the resulting down-time, too, as a multi-TB restore probably takes a while...
My recommendation: Do NOT use ext3! The fs-design and associated tools (here fsck) aren't robust enough for real production use (yet?). The way fsck handles the memory error and the fact that the error occurs in the first place are not acceptable in my mind. I will be trying xfs from now on, but don't yet have enough experience with it to tell whether it's any better.

Futile32 · Answer 3 · 2018-09-07T08:29:50+08:00

Futile32

2018-09-07T08:29:50+08:002018-09-07T08:29:50+08:00

Unfortunately, I'm not able to "add a comment" but had to chime in here and thank the Op. I had a RAID6 failure and manually assembled 6 of the 8 drives with closely matching Event Counts. However I wasn't able to mount the assembled array.

It appeared that I needed to use a backup Super-block. Running fsck -b <location> ... eventually died with out-of-memory, which led me to this thread/question.

In short, using fsck -b <location>... and then doing ctrl+c allowed me to mount my array and recover my files.

Thanks!

0

Major issues with fsck of 10TB ext3 RAID 6 (memory allocation failed, etc.)

Can you pass user/pass for HTTP Basic Authentication in URL parameters?

Ping a Specific Port

Check if port is open or closed on a Linux server?

How to automate SSH login with password?

How do I tell Git for Windows where to find my private RSA key?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?