I have an external USB hard drive, consisting of a 1 TB SATA drive in a Rosewill RX35-AT-SU SLV Aluminum 3.5" Silver USB 2.0 External Enclosure, plugged into my SONY VAIO VGN-NS310F laptop. It is plugged directly into the computer (not through a hub). The drive inside the enclosure is a 7200 rpm Western Digital, but I don't remember the exact model. I can remove the drive from the enclosure (again), if people think it's necessary to know that detail.
The drive is formatted ext4. I mount it dynamically with udisks on my Lubuntu 11.10 system, usually automatically via PCManFM. (I have had Lubuntu 12.04 on this machine, and experienced all this same behavior with that too.) Every once in a while--once or twice a day--it becomes inaccessible, and difficult to unmount. Attempting to unmount it with sudo umount ... gives an error message saying the drive is in use and suggesting fuser and lsof to find out what is using it. Killing processes found to be using the drive with fuser and lsof is sometimes sufficient to let me unmount it, but usually isn't.
Once the drive is unmounted or the machine is rebooted, the drive will not mount. Plugging in the drive and turning it on registers nothing on the computer. dmesg is unchanged. The drive's access light usually blinks vigorously, as though the drive is being accessed constantly. Then eventually, after I keep the drive off for a while (half an hour), I am able to mount it again.
While the drive doesn't work on this machine for a while, it will work immediately on another machine running the same version of Ubuntu. Sometimes bringing it back over from the other machine seems to "fix" it. Sometimes it doesn't.
The drive doesn't always stop being accessible while mounted, before becoming unmountable. Sometimes it works fine, I turn off the computer, I turn the computer back on, and I cannot mount the drive.
Currently this is the only drive with which I have this problem, but I've had problems that I think are the same as this, with different drives, on different Ubuntu machines. This laptop has another external USB drive plugged into it regularly, which doesn't have this problem. Unplugging that drive before plugging in the "problem" drive doesn't fix the problem.
I've opened the drive up and made sure the connections were tight in the past, and that didn't seem to help (any more than waiting the same amount of time that it took to open and close the drive, before attempting to remount it).
Does anyone have any ideas about what could be causing this, what troubleshooting steps I should perform, and/or how I could fix this problem altogether?
Update: I tried replacing the USB data cable (from the enclosure to the laptop), as Merlin suggested. I should've tried that long ago, since it fits the symptoms perfectly (the drive works on another machine, which would make sense because the cable would be bent at a different angle, possibly completing a circuit of frayed wires). Unfortunately, though, this did not help--I have the same problem with the new cable. I'll try to provide additional detailed information about the drive inside the enclosure, next time I'm able to get the drive working. (At the moment I don't have another machine available to attach it.)
Major Update (28 June 2012)
The drive seems to have deteriorated considerably. I think this is so, because I've attached it to another machine and gotten lots of errors about invalid characters, when copying files from it. I am less interested in recovering data from the drive than I am in figuring out what is wrong with it. I specifically want to figure out if the problem is the drive or the enclosure.
Now, when I plug the drive into the original machine where I was having the problems, it still doesn't appear (including with sudo fdisk -l), but it is recognized by the kernel and messages are added to dmesg. Most of the messages consist of errors like this, repeated many times:
[ 7.707593] sd 5:0:0:0: [sdc] Unhandled sense code
[ 7.707599] sd 5:0:0:0: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
[ 7.707606] sd 5:0:0:0: [sdc] Sense Key : Medium Error [current]
[ 7.707614] sd 5:0:0:0: [sdc] Add. Sense: Unrecovered read error
[ 7.707621] sd 5:0:0:0: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[ 7.707636] end_request: critical target error, dev sdc, sector 0
[ 7.707641] Buffer I/O error on device sdc, logical block 0
Here are all the lines from dmesg starting with when the drive is recognized. Please note that:
- I'm back to running Lubuntu 12.04 on this machine (and perhaps that's a factor in better error messages).
- Now that the drive has been plugged into another machine and back into this one, and also now that this machine is back to running 12.04, the drive's access light doesn't blink as I had described. Looking at the drive, it would appear as though it is working normally, with low or no access.
- This behavior (the errors) occurs when rebooting the machine with the drive plugged in, and also when manually plugging in the drive.
- A few of the messages are about /dev/sdb. That drive is working fine. The bad drive is /dev/sdc. I just didn't want to edit anything out from the middle.
To determine whether the problem is the drive or the enclosure, remove the drive from the enclosure, install it in a desktop with sufficient power, and check the SMART status.
For a deeper test, you can check every sector of the drive with a tool like ddrescue. ddrescue reports the error size during the process, and you can attempt data recovery at the same time, as in: sudo ddrescue /dev/sdb2 /path/to/recovery.image logfile. List the partitions with sudo lsblk or the classic fdisk -l.
If you truly have no interest in the data, you can force the output file to /dev/null, and you'll still get a report of any error size on stdout.
Tested on Ubuntu 14.04 with GNU ddrescue 1.17, using /dev/sdb2 (a 1MB swap partition).
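Besides the on-screen report, the logfile that ddrescue writes records which regions failed, so the damage can be totalled afterwards. The following is only a sketch: the function name bad_bytes_in_logfile is made up for illustration, and it assumes the usual logfile layout of "pos size status" lines with hexadecimal values, where a status of - marks bad sectors.

```shell
# Hypothetical helper: add up the sizes of all regions that ddrescue
# marked with status '-' (bad sectors) in its logfile.
bad_bytes_in_logfile() {
    local total=0 pos size status rest
    while read -r pos size status rest || [ -n "$pos" ]; do
        case "$pos" in
            ''|'#'*) continue ;;      # skip blank lines and comments
        esac
        # Region lines are "pos size status". The current-position line
        # never has '-' as its third field, so it is skipped here too.
        [ "$status" = "-" ] || continue
        total=$(( total + $size ))    # shell arithmetic accepts 0x... hex
    done < "$1"
    echo "$total"
}

# Usage, once a ddrescue run has produced "logfile":
#   bad_bytes_in_logfile logfile
```

A result of 0 after a complete pass would point away from the platters and toward the enclosure, cable or power; a non-zero result means the drive itself has unreadable areas.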
I had similar experiences when I was running Ubuntu 12.04 desktop. My hard drive enclosure had a few options for power: I could buy an a/c adapter, use a single mini USB to normal USB cable, or use a mini USB cable that splits into two USB plugs. Ideally it needed to be connected to both USB ports to supply ample power. It could transfer data over the USB cable or over eSata.
When using the eSata connection I needed to first supply the drive power so the disk was spinning and then boot the system so the bios would recognize the already spinning disk. Otherwise it would not see the disk in time. I believe this has something to do with the controller for the enclosure.
When I mounted the USB I had very mixed results when plugging the cable first into the enclosure and second into the USB ports. Maybe about half of the time it would mount correctly. If I plugged the USB cables into the PC ports first and then into the enclosure second, I had much better results, at around 70%. The best results I got with the USB options came from using an external power source (a/c adapter) for the enclosure, to make sure the disk was spinning and stable before I plugged it in to the machine. That worked pretty much 100% of the time.
Not saying that this is exactly your issue, but for me it helped to provide the enclosure power and have the disk spinning before connecting it to have it read by the system. Perhaps the bios or bus speed on your one system is better than the other and it gives time for the enclosure's controller to start working before it tries to read the disk? And perhaps some time after the enclosure is unplugged the controller resets itself?
Maybe the controller needs time to decide whether it is just getting power from the USB ports or power and data. Maybe it's a voltage or amperage issue? Either way, enclosure controllers seem to be finicky.
For your USB drive try the following steps (if you have not done this already):
sudo fdisk -l #get info
sudo mkdir /media/external #create mount point
sudo mount -t vfat /dev/sdb1 /media/external -o uid=1000,gid=1000,utf8,dmask=027,fmask=137 #mount (adjust -t and the options to your filesystem; the vfat options shown do not apply to ext4)
Or try using pmount for mounting your USB.
I would suggest there could also be a problem with your USB bus driver chip or similar. Do you have any tools to list all USB devices? Try running those tools when the drive is working properly, and when the drive is inaccessible. Do you see any differences?
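One way to do that comparison is with lsusb: save its output once while the drive works and once while it is inaccessible, then compare the two. The helper below is only a sketch; the name usb_ids_missing and the snapshot file names are made up, and it assumes the usual lsusb line format containing "ID vendor:product".

```shell
# Take the snapshots by hand, for example:
#   lsusb > usb-working.txt    # while the drive is mounted and fine
#   lsusb > usb-broken.txt     # while the drive is inaccessible

# Hypothetical helper: print the vendor:product IDs that appear in the
# first snapshot but not in the second. Bus and device numbers are
# ignored because they change every time a device is replugged.
usb_ids_missing() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    sed -n 's/.*ID \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\).*/\1/p' "$1" | sort -u > "$a"
    sed -n 's/.*ID \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\).*/\1/p' "$2" | sort -u > "$b"
    comm -23 "$a" "$b"    # lines found only in the first list
    rm -f "$a" "$b"
}

# Usage:
#   usb_ids_missing usb-working.txt usb-broken.txt
```

If the enclosure's ID shows up in the list, the USB bridge is not being enumerated at all in the broken state. If nothing is listed, the bridge is still visible and the problem lies further down, with the disk or the power it is getting.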
Older (and cheaper) drives go through a process called "Thermal Recalibration" (TCAL), which happens every hour or so, and they can become inaccessible for several seconds while the drive estimates how much the disk head is being bent by heat build-up in the drive. During TCAL the stepper motor and coil seek to every track and the head is aligned on every track, and the results are stored. This is an internal feature of the firmware. It sounds like this process is either getting stuck, or perhaps it is producing the wrong answers, making it impossible to access the drive after thermal recalibration.
The error you posted from dmesg indicates there are problems on sdc, sector 0, logical block 0. These low-numbered blocks often contain the geometry of the drive (i.e. the hard or soft formatting). If these blocks are going bad, the whole drive may become inaccessible, permanently. The media failure may be heat related, which might explain why a period of inactivity (moving the drive to another machine) sometimes fixes it, sometimes doesn't fix it.
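To see at a glance which devices are throwing these errors and how often, the kernel log can be tallied per device. This is a rough sketch: io_errors_per_device is a made-up name, and the pattern only matches the "Buffer I/O error on device ..." wording from the posted log (other kernel versions may word the message differently).

```shell
# Hypothetical helper: read kernel log text on stdin and print how many
# "Buffer I/O error" lines were logged for each device.
io_errors_per_device() {
    sed -n 's/.*Buffer I\/O error on device \([^,]*\),.*/\1/p' |
        sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Usage:
#   dmesg | io_errors_per_device
```

If only sdc appears in the output, the working drive on sdb can be ruled out as a source of the errors.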
To figure out if the problem is thermal-related, turn on the computer and start a stopwatch, but don't really use the drive. Just wait for it to fail, and record how long it takes to fail. Then turn it off and leave it off for several hours to cool down. Re-run the test: turn on the computer and drive, but this time start a huge data-intensive drive-drive copy (same drive). Doing more work with the stepper motor will presumably cause the drive to heat up quicker and fail sooner. If there is a big change in the time to failure, then the drive is toast and I would get another one. Good luck!
Quite often this type of problem is caused by lack of sufficient power coming down the USB cable to the drive, and this is particularly likely where the external drive was not bought off-the-shelf but self-assembled. (You would hope that a manufacturer of an external drive would have made sure that USB ports could support it.)
A device may draw up to 500 mA from a port in the USB 2.0 spec and up to 900 mA in USB 3.0. By checking the manufacturer's specifications of the external drive you might be able to confirm the maximum power requirements of your drive.
The problem can often be fixed by trying a USB3 port (if you have one, and you haven't already tried this), because they provide more power than USB2, or by getting a USB Y cable so that the drive can get power from 2 ports instead of 1. These are available inexpensively on eBay or Amazon.
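The arithmetic behind this is simple enough to sketch. The helper name usb_power_check is made up, and the 700 mA requirement in the usage lines is a placeholder, not a figure for any particular drive; take the real number from the manufacturer's specifications. The per-port limits are the 500 mA and 900 mA mentioned above, and a Y cable is treated as nominally doubling the budget.

```shell
# Hypothetical helper: compare a drive's peak current draw with what the
# port(s) can nominally supply.
# Arguments: required mA, per-port limit in mA, number of ports (default 1).
usb_power_check() {
    local need="$1" per_port="$2" ports="${3:-1}"
    local avail=$(( per_port * ports ))
    if [ "$need" -le "$avail" ]; then
        echo "ok: ${need} mA needed, ${avail} mA available"
    else
        echo "short by $(( need - avail )) mA"
    fi
}

# Usage with a placeholder requirement of 700 mA:
#   usb_power_check 700 500 1    # one USB 2.0 port: short by 200 mA
#   usb_power_check 700 500 2    # Y cable on two USB 2.0 ports: ok
#   usb_power_check 700 900 1    # one USB 3.0 port: ok
```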
Although the real cause of the problem was already pointed out, I want to add the same answer, since I have some 4 external HDDs.
Any computer is manufactured on the assumption that its power source will be used according to the specifications of its configuration, with at most an overload of 20%.
Any external USB device HAS TO BE POWERED FROM AN EXTERNAL POWERED HUB, in order to protect the computer's power source. The behaviour described is typical of power overload. Supposing you also have an external DVD drive, this would make your recordings fail as well, and might even render your device(s), and furthermore your computer, irrecoverable. Laptops usually break this way, since users tend to use passive USB hubs or computer-powered devices, including HDDs, DVDs and the like.
Buy an external powered USB hub and connect the USB devices THROUGH A POWERED USB PORT, instead of draining the power from the computer's source, since this approach will damage more than your HDD. A power surge is mostly the same as an underpowered computer or external device. The USB standard has nothing to do with the underpowered device. Think of it like this: if you want power from your car, what fuel would you use? Anything that burns, or the manufacturer's specified fuel? It is exactly the same here. Using UNPOWERED USB devices drains power from the internal power source of the computer. It already has enough devices attached to it!