How worried should I be about my hard drive?
July 3, 2018 7:42 PM

I have a ThinkPad. I regularly run the diagnostic tool from Lenovo to check on the system health of the machine. Today, even though it said it got a 100% pass, the check gave me yellow warning indicators on the Smart Short Self Test and the Smart Drive Self Test. So what does this mean? Is a hard disk failure next?

This is the first time I got a yellow on the Smart Drive Self Test. A couple months back I got a warning on the Smart Short Self Test, but I thought it might be a fluke because the system check came on while the computer was in the midst of doing some heavy downloading. I stopped the download, ran the system check again and got a perfect green pass. Since then, the scan has run regularly with all green passes.

If it means anything, the computer feels warmer than normal in the spot where I know the hard disk is located (below the keyboard to the left of the trackpad, where the palm of my hand and wrist are when my fingers are on the home position on the keyboard).

From what I can tell on the Internet, it sounds like a disk failure is in my future. Is this inevitable? Can I do something to prevent it? What steps or measures should I take?

If it matters, I have a 500GB hard drive on a Win 8.1 system being powered by an i7-4900MQ @ 2.8GHz.
posted by sardonyx to Computers & Internet (13 answers total) 4 users marked this as a favorite
First things first: does it say anywhere whether it's a read or a write head issue? Read heads are going to be more important for reading the data already on the drive and backing it up. Get the standard edition of CrystalDiskInfo here and use it as a second opinion. If it says something is wrong, something is wrong, and you should proceed with the steps below.

Get a backup of everything as soon as possible. This is not me scaring you, but warning you that a disk failure will happen. You'll need an external hard drive (of at least 500GB) and use the backup built into Windows to back up all of the drive. I don't have the steps in my head for Win8 anymore, but type Backup when you're in the Start pane, and it should show up.

Then, you can either wait until it fails, backing up every once in a while, or you can buy another hard drive, replace the current one, get restore disks from Lenovo, and restore your files and programs. Make sure you have downloads squared away for programs you need, or disks that have the programs on them.

You could also try cloning the old drive to a new one, but that's hard to talk folks through. Any local computer repair place should be able to put a new hard drive in it, with some labor to clone the old one or reload stuff for you, if that's not something you feel comfortable doing yourself.
posted by deezil at 8:06 PM on July 3, 2018 [3 favorites]

Back up your data right now. Believe it when the drive tells you that it is failing. Expect it to fail at any time. Use it as little as possible until you can back everything up.

On this or another computer, download and create a bootable USB for Windows 8 or Windows 10. This is easy to do. Eightforums/Tenforums have great tutorials that even I can follow. I suggest using an 8 or 16GB USB 2.0 flash drive, as using a USB 3 drive has been a problem for some people on some machines.

Buy a new hard drive. If you can, take this opportunity to upgrade to an SSD. They are the best thing ever. I personally use Samsung, and the 860 EVO is very reasonably priced now, but Crucial MX500 has good reviews. Find some instructions for replacing your hard drive. Photos are good. For Thinkpads, this is usually easy to find.

Replace hard drive, install OS, update everything, put your files back on. Enjoy fresh clean OS feeling.

You might want to take this opportunity to see if there are any heat issues known with your model. I'd have checked, but you didn't specify.

The thinkpad subreddit has links to Lenovo support pages and manuals and other useful information in the sidebar. The Lenovo section of NotebookReview is not as comprehensive as the Dell section, but it doesn't need to be. However, they tend to get more technical than the subreddit, so that's where I'd look for people talking about thermal issues and advice on repasting if you think that it needs it.
posted by monopas at 8:29 PM on July 3, 2018 [6 favorites]

N'thing above... Do you care if you lose all the data on that drive permanently, at some unknown time in the not-distant future?

If the answer is anything other than a simple "no," back up and replace ASAP. It's not hard to DIY it (Clonezilla on a USB stick, and an external USB case for the replacement drive), but it should also be a straightforward job for any kind of shop that does this kind of thing. Thinkpads are generally designed for easy maintenance.
posted by doomsey at 8:33 PM on July 3, 2018 [1 favorite]

Admittedly, I am an alarmist about hard drive failure, because it seems like nobody ever backs up their data. If you have other computers available, then having your laptop hard drive fail may not inconvenience you terribly. But if this is your only computer and you depend upon it for your work, or your social life, or it has the only copies of your photos and the novel you've been writing for the past 5 years, I think it is better to be safe than sorry.
posted by monopas at 8:39 PM on July 3, 2018

N'thing getting an SSD. I had a ThinkPad get knocked off a footstool and lost a spinning drive, and had a ThinkPad go flying out of my bike bag after a spectacular bike crash without losing a bit. I still back up to spinning media, because it's cheap, but otherwise, SSD for life!
posted by dws at 9:05 PM on July 3, 2018

Response by poster: I'm pretty religious about backing up my data, but I'll make sure that everything is doubly stored elsewhere.

The laptop is a W540. Sadly, the report didn't give a lot of specifics, just a yellow warning triangle with an exclamation mark. That plus a nice little note saying that even if there were warning indicators it didn't actually mean there was a problem and that everything is really hunky dory, not to worry. (Which I don't really believe.) So the short answer is I don't know if it's a read head or a write head.

Is this the Crystal Disk Info site being recommended?
posted by sardonyx at 9:21 PM on July 3, 2018

Smart Short Self Test and the Smart Drive Self Test

I don't use Windows but are these the quick test that completes in less than 5 minutes and the slow one that reads the whole disk and takes hours, respectively?

If yes and if after the long test the SMART tool is still complaining, I'd be concerned. You already have backups and that's what matters most.

If not, do the extended SMART test, it sometimes finds bad sectors and remaps them internally so they won't be used again, and if successful your drive status goes right back to OK.

Anecdata: I have a 10+ year old drive where this happened that is still chugging along, and a 2-year-old one where the extended test failed that died in less than a week.
posted by Bangaioh at 1:57 AM on July 4, 2018

Response by poster: Both of the tests are part of an overall system scan that can be performed by Lenovo Solutions Centre. It supposedly tests everything from the fan to the disks to the processor. So I guess they're both the short ones as the whole process takes maybe 30 minutes (give or take, I've never timed it).
posted by sardonyx at 8:22 AM on July 4, 2018

Yes, this is CrystalDiskInfo. You probably want the standard version in .zip form (unless you want 150MB of anime girls decorating the app. No, really.) Unless I'm missing something, you can't actually initiate a SMART self test from this program; all it does is display SMART values. I suspect your Lenovo tool is telling SMART to run self tests.

Understanding SMART values is complicated, although Wikipedia gives you a good start. In general things like "Seek Error Rate" are reported on a 100-0 scale: 100 is perfect, 0 is dead, and there's also a threshold below which the disk is considered to be really failing. Sometimes it's 200-0 instead. Modern disks are designed to absorb a few errors; it's OK if your Reallocated_Sector_Count is not a perfect score, that's expected.
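
To make the scale concrete, here is a minimal Python sketch of how a normalized SMART value compares against its threshold. Everything in it is illustrative: the `SmartAttribute` class, the `status` function, the 10-point "caution" margin and the sample readings are made up, and real drives and tools define their own attributes and cutoffs.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # current normalized value (higher is healthier)
    threshold: int  # vendor-set failure threshold


def status(attr, margin=10):
    """Classify an attribute by how close its value is to the threshold.

    `margin` is an arbitrary illustrative cushion, not part of SMART.
    """
    if attr.value <= attr.threshold:
        return "failing"
    if attr.value - attr.threshold <= margin:
        return "caution"
    return "ok"


# Hypothetical readings, not taken from any real drive.
samples = [
    SmartAttribute("Seek_Error_Rate", value=100, threshold=30),
    SmartAttribute("Reallocated_Sector_Ct", value=98, threshold=36),
    SmartAttribute("Spin_Retry_Count", value=38, threshold=36),
    SmartAttribute("Raw_Read_Error_Rate", value=20, threshold=51),
]

for attr in samples:
    print(f"{attr.name:24} {attr.value:3d} / {attr.threshold:3d}  {status(attr)}")
```

With these made-up numbers, the first two attributes come out "ok" even though one of them is below a perfect 100, which is the point about disks absorbing a few errors.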

In my experience it's the trend in errors you want to watch. A disk with a couple of failures may be OK, but a disk with a new failure every month needs to be replaced immediately.
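
If you jot down the raw Reallocated Sector Count each time you check, spotting a trend is simple arithmetic. A rough Python sketch, assuming a hand-kept log; the dates, counts, function names and the "rose in each of the last two checks" rule are all hypothetical, not any tool's actual logic:

```python
from datetime import date

# Hypothetical log of (date checked, raw Reallocated Sector Count).
history = [
    (date(2018, 4, 1), 2),
    (date(2018, 5, 1), 2),
    (date(2018, 6, 1), 5),
    (date(2018, 7, 1), 9),
]


def deltas_between_readings(readings):
    """Return the change in the raw count between consecutive readings."""
    counts = [count for _, count in readings]
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def is_growing(readings, recent=2):
    """True if the count rose in each of the last `recent` intervals.

    `recent=2` is an arbitrary illustrative choice, not a standard.
    """
    deltas = deltas_between_readings(readings)
    return len(deltas) >= recent and all(d > 0 for d in deltas[-recent:])


print(deltas_between_readings(history))  # [0, 3, 4]
print(is_growing(history))               # True
```

A count that sits at 2 for years would never trip this check; one that climbs at every reading would.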

You've already got backups of your data, but consider it'll take you a day or two to replace a dead drive and get the system bootable again.
posted by Nelson at 9:15 AM on July 4, 2018 [2 favorites]

In my experience - if you are getting warnings/errors, the disk is about to go. Especially if that drive is "spinning disk" tech and not SSD.

Myself, I recently got a weird boot disk failure error on my W530, when I had left a new USB accidentally in during a reboot... I *hope* it's not my internal drives failing (because they are RAID0, so when they go, no data will be recoverable), so I am in the process of refreshing my backups...

... if you drop in a new SSD hard drive into the unit, you will get at least another year out of that machine, possibly even two-three - the main thing is actually having/making install media (and having keys/license codes) for your operating system and software.
posted by jkaczor at 10:02 AM on July 4, 2018

Response by poster: Okay, here's the CrystalDisk report if anybody is interested: the short version is everything is green and looks okay (I think). (Yes, I know I have too much data on the disk and that I'm well beyond the suggested threshold. I'm slowly working to resolve that issue and reorganize my filing system to allow me to store more data on secondary drives.)

Sequential Read (Q= 32,T= 1) : 99.382 MB/s
Sequential Write (Q= 32,T= 1) : 92.325 MB/s
Random Read 4KiB (Q= 8,T= 8) : 0.623 MB/s [ 152.1 IOPS]
Random Write 4KiB (Q= 8,T= 8) : 0.551 MB/s [ 134.5 IOPS]
Random Read 4KiB (Q= 32,T= 1) : 0.575 MB/s [ 140.4 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 0.533 MB/s [ 130.1 IOPS]
Random Read 4KiB (Q= 1,T= 1) : 0.269 MB/s [ 65.7 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 0.554 MB/s [ 135.3 IOPS]

Test : 1024 MiB [C: 91.8% (413.2/450.3 GiB)] (x5) [Interval=5 sec]
Date : 2018/07/04 13:06:20
OS : Windows 8.1 Pro [6.3 Build 9600] (x64)
posted by sardonyx at 10:09 AM on July 4, 2018

if you are getting warnings/errors, the disk is about to go. Especially if that drive is "spinning disk" tech and not SSD.

About the only advantage that spinnies have over SSDs (apart from being a tenth of the price per gigabyte) is that they do give you early warning of failure. Unlike SSDs, spinnies tend to grow increasing numbers of bad blocks before they die completely; clone and replace a spinny when it starts to grow bad blocks and typically all you need to do is recover a handful of affected files from backups.

On a Windows box I will generally use PassMark DiskCheckup to interrogate the SMART logs for drives I suspect, and look for an excessive raw value (more than a few tens) for Reallocated Sector Count. DiskCheckup can also kick off an internal SMART self-test.

I *hope* it's not my internal drives failing (because they are RAID0, so when they go, no data will be recoverable

That's over-pessimistic unless they're SSDs, in which case putting them in RAID0 would be a little Can=Must for a personal workstation. Sure, if one of them dies in such a way as to render it completely inaccessible you're unlikely to be able to get much that's meaningful back from the other, but if you clone and replace either one individually before it dies completely you'll lose a few blocks at worst.
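
A toy model of striping shows the difference. This Python sketch assumes two drives and a stripe unit of one block (real arrays use much larger stripe sizes), and the function names and bad-block positions are invented for illustration:

```python
def locate(logical_block, num_drives=2):
    """Map a logical block to (drive index, block on that drive).

    Assumes a stripe unit of one block, which is a simplification.
    """
    return logical_block % num_drives, logical_block // num_drives


def lost_logical_blocks(total_blocks, bad, num_drives=2):
    """List logical blocks that land on a bad (drive, block) location."""
    return [b for b in range(total_blocks) if locate(b, num_drives) in bad]


TOTAL = 16

# Case 1: drive 0 has grown two bad blocks (hypothetical positions).
few_bad = {(0, 1), (0, 5)}
print(lost_logical_blocks(TOTAL, few_bad))     # [2, 10]

# Case 2: drive 0 is completely unreadable.
dead_drive = {(0, i) for i in range(TOTAL // 2)}
print(lost_logical_blocks(TOTAL, dead_drive))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

In the first case two blocks out of sixteen are gone. In the second, every other block of the array is missing, so almost no file larger than one stripe unit survives intact.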
posted by flabdablet at 2:10 AM on July 5, 2018 [1 favorite]

That's over-pessimistic unless they're SSDs, in which case putting them in RAID0 would be a little Can=Must for a personal workstation.

Heh... they are SSDs... RAID0 for "speed" and in theory to reduce the overall "wear-leveling" by splitting activities across two drives - I've had this hardware since 2014 and it still cold boots to login in about 3 seconds. Run many many VMs and everything is always lightning fast. (Yes, overkill... this "laptop" also has 32GB RAM, but is still going strong - the only thing that needs replacing is the trackpad/touchpad)

But... I also have a 5-bay Synology where I back up everything (and it has iSCSI, so my VM data is kept there) - and drives do tend to start failing every 2-3 years, when they do, I start replacing them, one every 4-6 weeks - until I have a whole new batch. The nice thing is that my actual capacity keeps growing every few years. I highly recommend a NAS, I have had both QNAP and Synology - admittedly, I prefer the latter.
posted by jkaczor at 10:35 AM on July 5, 2018 [1 favorite]
