I have an NTFS disk where I store my data. After a problem with a bad IDE/SATA adapter (which kept shutting down my disk), S.M.A.R.T. is showing errors.
I want to know what else I need to do to check for and fix any errors on this disk.
I've used fsck to check the disk, but its output is not verbose enough for me:
andre@PITCAIRN:~$ sudo fsck /dev/sdb1
fsck from util-linux 2.20.1
Mounting volume... OK
Processing of $MFT and $MFTMirr completed successfully.
Checking the alternate boot sector... OK
NTFS volume version is 3.1.
NTFS partition /dev/sdb1 was processed successfully.
The S.M.A.R.T. attributes:
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 115 099 006 Pre-fail Always - 95292924
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 097 097 020 Old_age Always - 3419
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 067 060 030 Pre-fail Always - 5425551
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6345
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1501
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 046 046 000 Old_age Always - 54
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 062 048 045 Old_age Always - 38 (Min/Max 31/39)
194 Temperature_Celsius 0x0022 038 052 000 Old_age Always - 38 (0 19 0 0 0)
195 Hardware_ECC_Recovered 0x001a 041 022 000 Old_age Always - 95292924
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 20
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 20
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 96499325213315
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 2999278438
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 866573403
And the S.M.A.R.T. error log:
SMART Error Log Version: 1
ATA Error Count: 54 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 54 occurred at disk power-on lifetime: 6088 hours (253 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 10:17:38.985 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:38.983 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:38.971 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:38.970 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:38.970 READ FPDMA QUEUED
Error 53 occurred at disk power-on lifetime: 6088 hours (253 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 10:17:35.999 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:35.999 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:35.998 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:35.998 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:35.998 READ FPDMA QUEUED
Error 52 occurred at disk power-on lifetime: 6088 hours (253 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 10:17:29.920 READ FPDMA QUEUED
27 00 00 00 00 00 e0 00 10:17:29.918 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 10:17:29.909 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 10:17:29.909 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 10:17:29.909 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 51 occurred at disk power-on lifetime: 6088 hours (253 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 10:17:27.106 READ FPDMA QUEUED
27 00 00 00 00 00 e0 00 10:17:27.104 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 10:17:27.095 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 10:17:27.095 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 10:17:27.095 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 50 occurred at disk power-on lifetime: 6088 hours (253 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 10:17:24.293 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:24.279 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:24.279 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:24.279 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 10:17:24.279 READ FPDMA QUEUED
I'm running the extended self-test on this disk while it is UNMOUNTED.
You have 20 bad sectors on the drive. They may simply have become corrupt, for example because of a sudden power loss in the middle of a write. You can try writing zeros to them and see whether they recover. You will need to identify the sector numbers in question, which you can see in the error log; the first one is 268435455. First, try to read that sector with dd to verify that it is bad.
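A minimal sketch of that read test, assuming the disk is /dev/sdb as in the question and that sectors are 512 bytes. `read_sector` is a hypothetical helper name, not part of any tool; it only wraps a plain dd call. SMART reports LBAs relative to the whole disk, so the sketch reads from the disk device rather than from the partition.

```shell
# Hypothetical helper: try to read one sector and discard the data.
# dd exits non-zero if the kernel reports an I/O error for that sector.
read_sector() {
    disk=$1        # whole-disk device (e.g. /dev/sdb), not a partition
    lba=$2         # sector number counted from the start of the disk
    bs=${3:-512}   # sector size in bytes; skip= is counted in units of bs
    dd if="$disk" of=/dev/null bs="$bs" count=1 skip="$lba"
}

# Usage, from a root shell:
#   read_sector /dev/sdb 268435455
```

One thing to keep in mind: 0x0fffffff is the largest value a 28-bit LBA field can hold, and every entry in the log shows it, so it may be a placeholder rather than the real address of a bad sector. If the read succeeds, the self-test log (`smartctl -l selftest /dev/sdb`) reports the LBA of the first error it finds.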
If this is a 4K-sector drive, use 4096 for bs= instead of 512. The read should fail with an error. If it does, overwrite the sector with zeros.
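The overwrite could be sketched as follows, under the same assumptions (/dev/sdb, 512-byte sectors). `zero_sector` is a hypothetical helper name. Whatever the sector held is lost, and a wrong device or sector number overwrites good data instead.

```shell
# Hypothetical helper: overwrite exactly one sector with zeros.
zero_sector() {
    disk=$1        # whole-disk device (e.g. /dev/sdb), not a partition
    lba=$2         # sector number counted from the start of the disk
    bs=${3:-512}   # sector size in bytes; seek= is counted in units of bs
    # seek= positions the OUTPUT (skip= would position the input).
    # conv=notrunc changes nothing on a block device, but prevents
    # truncation if the target is a regular file such as a disk image.
    dd if=/dev/zero of="$disk" bs="$bs" count=1 seek="$lba" conv=notrunc
}

# Usage, from a root shell:
#   zero_sector /dev/sdb 268435455
```

If the write itself fails with an I/O error, bypassing the page cache with GNU dd's `oflag=direct` may be worth trying.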
Double-check the command before hitting Enter; if you don't get it exactly right, you can destroy data.
Repeat this for each of the sectors in the error log, then check the SMART status again. The pending count (Current_Pending_Sector) should go down. If the reallocated count (Reallocated_Sector_Ct) goes up, the sectors were physically damaged and you should replace the drive. If it does not, the drive should be fine. You might also run the long SMART self-test to find more bad sectors.
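To compare the counts before and after, something like the following may help. `smart_counts` is a hypothetical filter name; it assumes the attribute table layout shown in the question, where the raw value is the tenth column. The smartctl options in the usage comments are the standard ones for printing attributes, starting a long self-test, and reading the self-test log.

```shell
# Hypothetical filter: print the name and raw value (column 10) of the
# attributes to watch, reading `smartctl -A` output from stdin.
smart_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2, $10 }'
}

# Usage, as root:
#   smartctl -A /dev/sdb | smart_counts   # reallocated and pending counts
#   smartctl -t long /dev/sdb             # start the long self-test
#   smartctl -l selftest /dev/sdb         # read the result once it finishes
```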
You've already done everything necessary to repair the drive. SMART is a monitoring system built into the hard drive's controller, so the operating system only displays the data that SMART has collected. You can't reset it, so once SMART has detected something it considers wrong, that will be remembered forever. Errors detected by SMART are not necessarily a sign of a failing drive; however, it is possible that something is indeed wrong. In any case, make sure you keep good backups, which you should do anyway.