We all know that SSDs have a limited, predetermined life span. So the question for me is: how do I check in (Ubuntu) Linux what the current health status of my SSD is? And perhaps get an estimate of how much life it has left?
A graphical tool is preferred, but a command-line tool would also be fine.
I'm using Xubuntu 12.04 LTS.
To check the health of an SSD on Ubuntu, Mint, or other Debian-based distributions, install the smartmontools package (it provides the smartctl tool).
The Media_Wearout_Indicator is what you are looking for. A value of 100 means your SSD has 100% of its life left; the lower the number, the less life remains.
To show your SSD information:
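The command itself did not survive in this post. As a minimal sketch, assuming smartmontools is installed and the drive is /dev/sda (the device path and the helper name wear_attributes are assumptions for illustration), the attribute table printed by smartctl -A can be filtered down to the wear-related rows named in this thread:

```shell
# Assumed setup on Debian/Ubuntu (not run by this snippet):
#   sudo apt-get install smartmontools

# Hypothetical helper: read `smartctl -A` output on stdin and print
# "ID NAME NORMALIZED_VALUE" for wear-related attributes.
# The names below are the ones mentioned in this thread; other vendors
# may use different names.
wear_attributes() {
  awk '$2 ~ /^(Media_Wearout_Indicator|Wear_Leveling_Count|Percent_Lifetime_Remain)$/ {
         print $1, $2, $4
       }'
}

# Usage (device path is an assumption):
#   sudo smartctl -A /dev/sda | wear_attributes
```

The fourth column of the table is the normalized VALUE, which is the number the answers here refer to.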
You can read the complete article at Nam Huy Linux Blog - How to check SSD life left on linux
Install Gnome Disk Utility and check SMART Data and Tests for wear-leveling-count or similar. The higher that number (%, from 1 to 100), the more "used up" your SSD is, which means you are more likely to have problems. But if you have a recent SSD, you need not worry about it.
Install it via your package manager, then start it either from the menu (Settings -> Disk Utility) or from the command line.
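The install and launch commands are missing from this post. The sketch below assumes the Debian/Ubuntu package is named gnome-disk-utility and that the launcher is called gnome-disks on newer releases and palimpsest on older ones (around Ubuntu 12.04); both names are assumptions to verify on your system, and find_disk_utility is a hypothetical helper.

```shell
# Assumed install command (not run by this snippet):
#   sudo apt-get install gnome-disk-utility

# Hypothetical helper: print the first Disk Utility launcher found on PATH.
find_disk_utility() {
  for candidate in gnome-disks palimpsest; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: launch whichever one is installed.
#   "$(find_disk_utility)" &
```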
If you don't have an Intel-brand SSD: READ THIS.
Watch out! I was blithely misled by smartmontools. I have a Samsung SSD, and the smartctl tool happily misreported attribute 233 (hex E9) as 'Media_Wearout_Indicator'. In fact, for Samsung (and other manufacturers) that attribute means something entirely different. This and other forum postings, Stack Exchange questions and answers, and power-user blogs I found seem to be Intel-focused, with only vague hints that 'it may vary' (rather than any warning that smartmontools may label the attribute incorrectly).
As I was preparing to copy my SSD to a new hard drive I'd bought (because of what smartmontools had told me), I booted into Windows (I have a dual-boot system) to learn something about SSDs from the Windows-only Samsung tool 'Samsung_Magician_v43.exe'. It was shockingly uninformative.
After hours of digging, I was finally able to run two Windows-only tools, HDDGuardian and CrystalDiskInfo. Surprise! Both tools independently tell me my Samsung SSD is 'just fine' (HDDGuardian says '5 stars' and CrystalDiskInfo says "98% OK"). By contrast, smartctl explicitly labeled attribute 233 (hex E9) as "Media Wearout Indicator" and told me its value was "1", or 1%, an indicator of (the risk of) pending failure. To be as sure as I can, I dug and dug and was finally able to locate at least something official from Samsung: Samsung White Paper 07: Communicating With Your SSD [archive.org]
The document indeed implies that attribute 233 (hex E9) is not used by Samsung in the same way. (Samsung: I'm very disappointed. Please either fix your official software tool, or at least make it clear that you do not provide wear-out indication information!)
Further, if you have neither an Intel SSD nor a Samsung SSD, be warned: this info does seem to vary across manufacturers (e.g. see the attribute label chart on https://code.google.com/p/hddguardian/wiki/about_reliability for the only useful indication of the degree of variability that I found).
The so-what: if you don't have an Intel SSD, do not be misled by the attribute name labels provided by smartmontools. Perhaps it will improve in the future, but the version installed by default on Ubuntu 12.04 LTS (as of April 2014) failed completely here. Instead of telling you it 'doesn't know', smartctl just mislabeled the attribute. I did not find another tool for Linux that made the 'correct' information transparent or clear.
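Since the label may be wrong for a given vendor, one cautious approach is to look an attribute up by its numeric ID and then check that ID against the manufacturer's own documentation. A minimal sketch, where attribute_by_id is a hypothetical helper and /dev/sda is an assumed device path:

```shell
# Hypothetical helper: print the row for a numeric attribute ID from
# `smartctl -A` output on stdin, without relying on the printed label.
attribute_by_id() {
  awk -v id="$1" '$1 == id { print }'
}

# Usage (device path is an assumption):
#   sudo smartctl -A /dev/sda | attribute_by_id 233
```

This does not tell you what the attribute means; it only avoids searching by a name that may not apply to your drive.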
For (at least some) NVMe drives, you can do
You can then look for a line like:
Here lower numbers are better, and 100% means the drive is "worn out". Manufacturer documentation suggests that it is possible to get numbers above 100% if you keep using the drive beyond this point (example from Seagate, see page 12).
Note that if you use the namespace or partition devices, like /dev/nvme0n1 or /dev/nvme0n1p1, it won't work and you will instead get a message like: Read NVMe SMART/Health Information failed: NVMe Status 0x4002.
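The exact command is missing from this answer. The error text quoted above matches smartctl's wording, so the sketch below assumes smartctl was used and that its NVMe report contains a "Percentage Used:" line. The helper names nvme_controller and percentage_used are hypothetical.

```shell
# Hypothetical helper: map a namespace or partition path such as
# /dev/nvme0n1 or /dev/nvme0n1p1 to its controller device /dev/nvme0.
nvme_controller() {
  printf '%s\n' "$1" | sed -E 's#^(/dev/nvme[0-9]+)(n[0-9]+(p[0-9]+)?)?$#\1#'
}

# Hypothetical helper: extract N from a "Percentage Used: N%" line on stdin.
percentage_used() {
  sed -n -E 's/^Percentage Used:[[:space:]]*([0-9]+)%.*$/\1/p'
}

# Usage (assumes smartctl is installed and the controller is /dev/nvme0):
#   sudo smartctl -a "$(nvme_controller /dev/nvme0n1p1)" | percentage_used
```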
For Kingston drives on Debian-based computers
Similar to this answer execute
However when I execute the command to show the drive info, it looks like SMART was disabled:
You need to enable that by executing the following as root:
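The command is missing from this post. A minimal sketch, assuming smartctl from smartmontools and a drive at /dev/sda (an assumed path), with hypothetical helper names: it checks the "SMART support is:" line that smartctl -i prints and enables SMART with the -s on option.

```shell
# Hypothetical helper: succeed if `smartctl -i` output on stdin reports
# SMART as enabled.
smart_is_enabled() {
  grep -E '^SMART support is:[[:space:]]+Enabled' >/dev/null
}

# Hypothetical helper: enable SMART on the given device (needs root).
smart_enable() {
  sudo smartctl -s on "$1"
}

# Usage (device path is an assumption):
#   sudo smartctl -i /dev/sda | smart_is_enabled || smart_enable /dev/sda
```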
You can then execute a self-test by doing either a short test (which took me about 1 minute):
or a more thorough test (which took me about 1.5 hours):
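The test commands are also missing here. As a sketch, assuming smartctl's -t option for starting self-tests and -l selftest for reading the self-test log, with /dev/sda as an assumed device path and hypothetical helper names:

```shell
# Hypothetical helpers wrapping smartctl's self-test options (need root).
smart_short_test() { sudo smartctl -t short "$1"; }
smart_long_test()  { sudo smartctl -t long "$1"; }

# Show the self-test log once a test has finished.
smart_test_log()   { sudo smartctl -l selftest "$1"; }

# Usage (device path is an assumption):
#   smart_short_test /dev/sda
#   smart_test_log /dev/sda
```

The tests run in the background on the drive itself, so the command returns immediately and the result appears in the log later.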
Note, in most circumstances you do not need to unmount the drive to execute these tests. If you do, see man smartctl.
Now, when you execute smartctl -a /dev/sda you should see a self-assessment test result. This is probably all you really need to concern yourself with:
If you like details, you will also see a table like this:
If you are looking for what all of these values mean, see the Kingston documentation.
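To pull just the overall verdict out of the report, a small filter is enough. This sketch assumes the result line reads "SMART overall-health self-assessment test result: ..." as smartctl prints it for ATA drives; overall_health is a hypothetical helper and /dev/sda an assumed device path.

```shell
# Hypothetical helper: print only the verdict (e.g. PASSED) from
# smartctl output read on stdin.
overall_health() {
  sed -n -E 's/^SMART overall-health self-assessment test result:[[:space:]]*(.*)$/\1/p'
}

# Usage (device path is an assumption):
#   sudo smartctl -H /dev/sda | overall_health
```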
Wear_Leveling_Count is the right attribute to track. However, like the other attributes, 100 is the BEST value and 0 is the WORST. Think of it as "percent life remaining".
The best way to check the health of an SSD is to follow the manufacturer's recommendations for doing so. As these vary from manufacturer to manufacturer and may change over time, it's a good idea to check with your drive's manufacturer if you have concerns. Based on MTBF ratings (the JEDEC JESD218A standard defines the method) provided by most manufacturers, an SSD should last well over a million hours without a problem.
I have several of these covering several manufacturers. I can guarantee that the SMART attributes vary between manufacturers. For comparison purposes here's an example from OCZ and smart data from a Corsair F40 unit along with a discussion regarding how unreliable this data is.
While SMART data can certainly have value, since all devices fail eventually, the important thing is that you back up your data regularly. This provides peace of mind that your data is safe while you wait (likely for several years) for your SSD to fail. As costs drop and capacities rise, it's more likely that you'll replace an SSD due to space constraints than due to failure (in my experience, 10x more likely). I would simply back up regularly and not worry about it.
Sources:
Experience, http://www.hardcoreware.net/mtbf-ssd-what-does-it-mean-for-you/
For my SSD (hdparm prints Model Number: CT480BX500SSD1), the parameter name in the SMART attribute output was Percent_Lifetime_Remain.
I've been using this system for about 4 months, quite actively (backend software development), and I've used up 2% of the lifetime so far. Maybe I should think about getting a better SSD.
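For a rough sense of scale, the figures above can be extrapolated linearly. This is only a sketch under the assumption that wear continues at the same rate, which real workloads may not follow; months_remaining is a hypothetical helper.

```shell
# Hypothetical helper: linear estimate of months of life remaining.
#   $1 = percent of lifetime used so far
#   $2 = months the drive has been in service
months_remaining() {
  awk -v used="$1" -v months="$2" 'BEGIN {
    if (used <= 0) { print "unknown"; exit }
    printf "%.0f\n", (100 - used) * months / used
  }'
}

# 2% used after 4 months, as reported above:
months_remaining 2 4   # prints 196
```

At that rate the remaining 98% would last about 196 months, or roughly 16 years.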