I have a few hard drives in mdadm RAID 5 configured to go to standby after a few minutes of inactivity (using spindown_time in hdparm.conf).
At irregular intervals I get messages like these in dmesg:
[ 1840.251661] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 1840.251722] ata4.00: failed command: SMART
[ 1840.251758] ata4.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[ 1840.251759] res 40/00:14:50:2e:04/00:00:02:00:00/40 Emask 0x4 (timeout)
[ 1840.251858] ata4.00: status: { DRDY }
[ 1840.251888] ata4: hard resetting link
[ 1840.600742] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1840.601521] ata4.00: configured for UDMA/133
[ 1840.601547] ata4: EH complete
[337877.713988] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[337877.714019] ata4.00: failed command: SMART
[337877.714038] ata4.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[337877.714039] res 40/00:04:90:10:81/00:00:00:00:00/40 Emask 0x4 (timeout)
[337877.714089] ata4.00: status: { DRDY }
[337877.714107] ata4: hard resetting link
[337878.063085] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[337878.063743] ata4.00: configured for UDMA/133
[337878.063764] ata4: EH complete
I think the exception is caused by smartd when a drive does not wake up quickly enough.
There are no issues (that I can tell) in accessing the drives normally through the file system - it takes a few seconds longer than normal when they are asleep, but there are no exceptions.
Is this something I should worry about, as a potential symptom of something that could corrupt a drive over time?
Or can I safely ignore it as part of normal operation?
Edit:
By request: smartctl -a for sda and sde; both disks are members of the array.
If ata4 is the same as scsi-4, then sde is the one that gave the error above, according to /dev/disk/by-path.
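One way to check which sdX device sits behind a given ataN port, rather than inferring it from the by-path names, is to look at the resolved sysfs path of each disk. The following is only a sketch: it assumes a kernel whose sysfs device paths contain the ataN component, and ata_port_from_path is a hypothetical helper name made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: extract the libata port name (e.g. "ata4")
# from a resolved sysfs path. Prints nothing if no ataN component exists.
ata_port_from_path() {
    printf '%s\n' "$1" | sed -n 's|.*/\(ata[0-9][0-9]*\)/.*|\1|p'
}

# List each sdX disk together with the ata port found in its sysfs path.
for dev in /sys/block/sd*; do
    [ -e "$dev" ] || continue            # skip if the glob matched nothing
    port=$(ata_port_from_path "$(readlink -f "$dev")")
    printf '%s -> %s\n' "${dev##*/}" "${port:-unknown}"
done
```

Disks that are not attached through libata (USB enclosures, for instance) would show up as "unknown" here.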
Provide the output of the command
smartctl -a /dev/sda
(replace sda with each of your disks) and post it on http://paste.ubuntu.com/. This will show whether your disks are spinning down too often in order to save energy (which may damage them). It will also show other information, such as temperature and bad sectors.
I had the same error message popping up on Armbian running on a Banana Pi. It turned out that the 5 volt pin in the Molex cable powering the hard disk was broken.
Oddly enough, my 3.5 inch hard disk ran with only the 12 volt supply working, but it constantly showed the same error as yours.
So if your hard disk passes the test, I recommend checking the power supply: try connecting the hard disk with a different SATA power cable.
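For the SMART check itself, the spin-down related counters can be pulled out of the attribute table instead of reading the whole report. This is a rough sketch, assuming the usual ten-column layout printed by smartctl -A with the raw value in the last column; smart_raw_value is a hypothetical helper name, and the sample line is made up for illustration, not taken from a real disk.

```shell
#!/bin/sh
# Hypothetical helper: print the RAW_VALUE column (field 10) of one
# SMART attribute, reading `smartctl -A` output from standard input.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Made-up sample line in the format printed by `smartctl -A`.
sample='  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       1234'
printf '%s\n' "$sample" | smart_raw_value Start_Stop_Count
```

On a real system this would be fed from the drive, for example smartctl -A /dev/sda | smart_raw_value Load_Cycle_Count (run as root). Attribute names vary between vendors, so the name may need adjusting.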
Hope this saves some time for anyone who faces such weird issues.