I've been having trouble with a Debian system's file system going into read-only mode. The short-term fix was to turn it off and back on again, but it keeps switching back to read-only mode. The Debian machine is a VM on an ESXi hypervisor, and the volume is exported in /etc/exports so that other VMs can mount its subfolders in their own /etc/fstab configurations.
Client /etc/fstab configs:
# Remember NFS isn't secured over the network.. it probably wouldn't matter but is still considered non-secure
storage-host-vm:/storage/drv_a/workpro_backups_samba /mnt/workpro_backups_samba nfs rw,sync,hard,intr 0 0
storage-host-vm:/storage/drv_b/temporary /mnt/temporary nfs rw,sync,hard,intr 0 0
storage-host-vm:/storage/drv_a/staff_pc_file_backups /mnt/staff_pc_file_backups nfs rw,sync,hard,intr 0 0
I just ran fsck and it gave the output below:
$ sudo fsck /storage/drv_a
fsck from util-linux 2.20.1
e2fsck 1.42.5 (29-Jul-2012)
/dev/sdd1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdd1: 264980/134217728 files (0.2% non-contiguous), 18092864/536870655 blocks
So did I just fix a fluke problem? Or should I be shopping for a new HDD and deciding on a data migration strategy since the drive is dying? Or is the fact that I'm exporting
The hard drive is "on" a 2TB 5400RPM Green Western Digital. I think it must have worked fine when it was first integrated into the system... Its purpose was storing Clonezilla HDD images (I DON'T make those often for obvious reasons ;) and it holds a bunch of ISOs. It also has a git repo that is committed to nightly for a legacy machine's backup purposes. The drive wasn't by any means 'high end', but I'm still a bit frustrated because I've only used about 2% of this drive's capacity so far and would expect slightly better performance. Is there a way I can "shut down" all the sectors that I've written to so far, since I've apparently used them up? Or what kind of failure is this? It's in a very well vented chassis with limited vibration and interference.
Here are some highlights from /var/log/messages:
[105762.692329] EXT4-fs warning (device sdd1): ext4_clear_journal_err:4365: Filesystem error recorded from previous mount: IO failure
[105762.692334] EXT4-fs warning (device sdd1): ext4_clear_journal_err:4366: Marking fs in need of filesystem check.
[105762.695164] EXT4-fs (sdd1): warning: mounting fs with errors, running e2fsck is recommended
[105762.793436] EXT4-fs (sdd1): recovery complete
[119886.884295] sd 0:0:3:0: [sdd] Unhandled error code
[119886.884299] sd 0:0:3:0: [sdd] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[119886.884304] sd 0:0:3:0: [sdd] CDB: Read(10): 28 00 72 8c 40 e0 00 00 90 00
[119886.884340] sd 0:0:3:0: [sdd] Unhandled sense code
[119886.884342] sd 0:0:3:0: [sdd] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
[119886.884345] sd 0:0:3:0: [sdd] Sense Key : Medium Error [current]
[119886.884349] sd 0:0:3:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
[119886.884354] sd 0:0:3:0: [sdd] CDB: Read(10): 28 00 72 8c 3f e0 00 01 00 00
[119890.019089] sd 0:0:3:0: [sdd] Unhandled sense code
[119890.019096] sd 0:0:3:0: [sdd] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
[119890.019102] sd 0:0:3:0: [sdd] Sense Key : Medium Error [current]
[119890.019108] sd 0:0:3:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
The hard drive is reporting errors, as suggested in another answer. Get it replaced.
If you need the data, it may be worthwhile to attempt a backup. Take the disk out and attach it to a different system. Then use (a variation of) dd to make a copy.
You could use ddrescue:
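A minimal sketch of a common two-pass GNU ddrescue run, wrapped in a function so nothing touches a disk until you call it. The device, image, and map file names are placeholders, and the retry count of 3 is an assumption rather than a recommendation. Run it as root and write the image to a different, healthy disk.

```shell
# rescue_disk SOURCE_DEVICE IMAGE_FILE MAP_FILE
# Pass 1 (-n) copies everything that reads cleanly and skips the slow
# scraping of bad areas. Pass 2 reuses the same map file, so it only
# revisits the areas that failed, retrying them up to 3 times (-r3)
# with direct disk access (-d).
rescue_disk() {
    ddrescue -n "$1" "$2" "$3" || return 1
    ddrescue -d -r3 "$1" "$2" "$3"
}

# Example call (placeholder names, assuming the failing disk is /dev/sdX):
#   rescue_disk /dev/sdX /mnt/healthy/diskimage.img /mnt/healthy/diskimage.map
```

The map file is what lets an interrupted run resume where it left off, so keep it next to the image.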
To find other useful guides on how to use it, just do a quick search on https://www.duckduckgo.com.
Alternatively the following dd command did help me out a couple of times when dealing with a bad disk. You can either use it only on those partitions you want to save, or the whole disk:
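A sketch of that kind of dd invocation (not necessarily the exact command), wrapped in a function so it only runs when called. /dev/sdX, the output paths, and the 64M default block size are placeholders. Pairing sync with noerror is an assumption on my part: it pads a block that fails to read with zeros so later data keeps its offset in the image, at the price of losing the whole block around each bad sector, which argues against very large block sizes on a badly damaged disk.

```shell
# image_disk SOURCE OUTPUT_IMAGE [BLOCK_SIZE]
# conv=noerror keeps dd going after a read error.
# conv=sync pads short or failed input blocks with zeros up to the block
# size, so the image can end up slightly longer than the source.
image_disk() {
    dd if="$1" of="$2" bs="${3:-64M}" conv=noerror,sync
}

# Example calls (run as root; names are placeholders):
#   image_disk /dev/sdX  /mnt/healthy/diskimage.img        # whole disk
#   image_disk /dev/sdX1 /mnt/healthy/partition1.img 64M   # one partition
```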
Use conv=noerror to keep dd from stopping on errors. To speed things up, set the bs= argument to about 1/4 to 1/2 of the RAM; the default is too low and slows dd down a lot because it results in too many disk operations. You may need to perform a filesystem check on the recovered image(s). If the file diskimage.img contains one partition you can mount it using:
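A sketch of loop-mounting such an image, assuming it holds a single filesystem that starts at the beginning of the file (ext4, like the original drive). The mount point is a placeholder, and the read-only option is my addition so the recovered copy isn't modified by accident.

```shell
# mount_image IMAGE_FILE MOUNT_POINT
# Creates the mount point if needed, then mounts the image read-only
# through a loop device.
mount_image() {
    mkdir -p "$2" || return 1
    mount -o loop,ro "$1" "$2"
}

# Example (run as root; names are placeholders):
#   e2fsck -f diskimage.img                   # filesystem check on the image
#   mount_image diskimage.img /mnt/recovered
```

If the read-only mount is refused because the journal still needs replaying, running the filesystem check on the image first should take care of that.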