It seems like my filesystem got corrupted somehow during the last reboot of my server. I can't fsck
some logical volumes anymore. The setup:
root@rescue ~ # cat /mnt/rescue/etc/fstab
proc /proc proc defaults 0 0
/dev/md0 /boot ext3 defaults 0 2
/dev/md1 / ext3 defaults,errors=remount-ro 0 1
/dev/systemlvm/home /home reiserfs defaults 0 0
/dev/systemlvm/usr /usr reiserfs defaults 0 0
/dev/systemlvm/var /var reiserfs defaults 0 0
/dev/systemlvm/tmp /tmp reiserfs noexec,nosuid 0 2
/dev/sda5 none swap defaults,pri=1 0 0
/dev/sdb5 none swap defaults,pri=1 0 0
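In case the layout matters: I can dump how the systemlvm volume group sits on top of the RAID arrays from the rescue system with something like the following (standard mdadm/LVM commands; I haven't pasted the output here, but can add it if it helps):

cat /proc/mdstat                # state of both md arrays
mdadm --detail /dev/md0         # members and sync status per array
mdadm --detail /dev/md1
pvs && vgs && lvs               # which physical volumes back the systemlvm VG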
[UPDATE]
First question: which "part" should I check for bad blocks? The logical volume, the underlying /dev/md, or the /dev/sdX below that? And is what I am doing here the right way to go at all?
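To make that concrete, this is roughly what I mean by checking each layer (a sketch only; device names taken from the fstab above, and badblocks run in its default read-only mode, so it should not touch any data):

badblocks -sv /dev/sda              # one of the physical disks
badblocks -sv /dev/md1              # an md array sitting on the disks
badblocks -sv /dev/systemlvm/usr    # the logical volume itself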
[/UPDATE]
The error message when checking /dev/systemlvm/usr:
root@rescue ~ # reiserfsck /dev/systemlvm/usr
reiserfsck 3.6.19 (2003 www.namesys.com)
[...]
Will read-only check consistency of the filesystem on /dev/systemlvm/usr
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Wed Feb 3 07:10:55 2010
###########
Replaying journal..
Reiserfs journal '/dev/systemlvm/usr' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..
Bad root block 0. (--rebuild-tree did not complete)
Aborted
Well, so far so good; let's try --rebuild-tree:
root@rescue ~ # reiserfsck --rebuild-tree /dev/systemlvm/usr
reiserfsck 3.6.19 (2003 www.namesys.com)
[...]
Will rebuild the filesystem (/dev/systemlvm/usr) tree
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Replaying journal..
Reiserfs journal '/dev/systemlvm/usr' in blocks [18..8211]: 0 transactions replayed
###########
reiserfsck --rebuild-tree started at Wed Feb 3 07:12:27 2010
###########
Pass 0:
####### Pass 0 #######
Loading on-disk bitmap .. ok, 269716 blocks marked used
Skipping 8250 blocks (super block, journal, bitmaps) 261466 blocks will be read
0%....20%....40%....60%....80%....100% left 0, 11368 /sec
52919 directory entries were hashed with "r5" hash.
"r5" hash is selected
Flushing..finished
Read blocks (but not data blocks) 261466
Leaves among those 13086
Objectids found 53697
Pass 1 (will try to insert 13086 leaves):
####### Pass 1 #######
Looking for allocable blocks .. finished
0% left 12675, 0 /sec
The problem has occurred looks like a hardware problem (perhaps
memory). Send us the bug report only if the second run dies at
the same place with the same block number.
mark_block_used: (39508) used already
Aborted
Bad. But let's run it again, as the message suggests:
[...]
Flushing..finished
Read blocks (but not data blocks) 261466
Leaves among those 13085
Objectids found 54305
Pass 1 (will try to insert 13085 leaves):
####### Pass 1 #######
Looking for allocable blocks .. finished
0%... left 12127, 958 /sec
The problem has occurred looks like a hardware problem (perhaps
memory). Send us the bug report only if the second run dies at
the same place with the same block number.
build_the_tree: Nothing but leaves are expected. Block 196736 - internal
Aborted
The same thing happens every time; only the actual error message changes. Sometimes I get "mark_block_used: (somenumber) used already", other times the message or the block number is different.
It seems like something is REALLY broken. Is there any chance I can somehow get these partitions working again? It's a hosted server, so I have no direct physical access to it.
Thanks in advance!
Well, after a few more hours of reiserfsck-ing, it seems that repeating this three-step process eventually solves the problem. I still don't know what caused it, since there seem to be no bad blocks on any drive, nor do I know how much data has been lost, but I am pretty sure this should not happen. One partition is still "replaying its journal", but I will report the success (or failure) as soon as I can reboot the machine.
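For the record, the loop I ended up repeating per volume was essentially just the commands already shown above, until a --rebuild-tree pass finally ran to completion; roughly (a sketch, the exact number of repetitions differed per volume):

reiserfsck --check /dev/systemlvm/usr          # read-only consistency check
reiserfsck --rebuild-tree /dev/systemlvm/usr   # aborted several times before it got through
reiserfsck --check /dev/systemlvm/usr          # re-check once the rebuild completed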