I hope AskMe 120389 has replies within the specified period
April 24, 2009 6:34 AM

My replacement hard drive went bad after just one day. I think the motherboard's the culprit. Need informed opinions.

Long story follows. Computer X (mobo G31M-S2L) had two hard drives, A and B. A's first partition has XP SP2. B's first partition had "Program Files" (due to a custom XP install that didn't go quite as intended). About a month back, two things started happening: 1) Opening the Explorer-replacement file manager started taking about 30 seconds, as opposed to happening almost immediately. 2) I started getting random but infrequent BSODs for "Page fault in non-paged area".

So I ran a few passes of Memtest86+ and found no errors. I also successfully ran the CPU stress test included in the handy boot CD I had. I never had an application crash on me during an intensive task or otherwise. So I looked up the logs and found a huge number of errors coded 9 & 7 (ideport timeout & bad block). I connected HDD B to another SATA connector using a new cable, but the errors persisted. So I ran full diagnostics on both drives using the manufacturers' utilities. As expected, A (Hitachi) showed no errors and B (Seagate) showed quite a few.

I asked my dealer to get B replaced under warranty. Its replacement C arrived day before yesterday. In the interim, X ran fine without any hiccups, i.e. no BSODs or temporary freezes. First thing I did upon getting C was run the SeaTools Long Test, just like with B. No errors.

So that night, I started to back up data from A onto C, so I could perform a proper reinstall of XP after repartitioning A. At first, the copying progressed smoothly, but then events similar to those with B recurred. A 700 MB file would start copying normally at 32 MB/s, but would stop in the middle for 10 seconds and then resume copying without further halts. Some files would copy at under 1 MB/s. After one batch of files had been copied, I decided to copy them back from C onto some temp space on A and, unsurprisingly, I got CRC errors on quite a few but not all. I ran the SeaTools Long Test again and this time 26 'errors' showed up (and were reported as repaired successfully).

Surprised that an apparently error-free HDD should go bad so soon, I placed C in computer Y and played around with transferring data to and fro. After shuffling 100+ GB around, there's been no hint of an error.

I arrived at the inference that the new HDD C isn't bad after all, and probably neither was B (in the initial stages at least). I should point out that I checked SMART periodically and it has never been reported as tripped for A, B or C.

So, I see the possibilities as

1) a problem with XP drivers (unlikely, since B was working fine before)
2) a problem that developed in the motherboard (most likely)
3) the new drive C is indeed bad in its own right
4) power supply problems in X (unlikely; the other drive and devices have never shown a problem)
posted by Gyan to Computers & Internet (7 answers total)
 
I suppose it's possible you have a power supply issue that causes the HDD to not complete reads and writes successfully. Or a bad controller on the motherboard.

When you run SeaTools (and the like) and it shows errors and says they have been repaired, that means it found bad sectors on the drive and was able to remap them. That's not inherently, automatically bad. But it's not confidence-inspiring either. As I understand it, all drives have bad sectors from the factory. But during the manufacturing process, those sectors are remapped and hidden away. A new drive shouldn't continue to get them. To me, this indicates a growing problem. I *believe* that those tools are smart enough to know the difference between computer failures and hard drive failures.

What I'd do is run something like DBAN on the drive. Run SeaTools, verify it's clean. Run DBAN. Run SeaTools again and make sure it stayed clean. If you get different results on computer X and Y, then the problem is in the computer, not the drive.

My understanding is that CRC errors are communication-related. The cable, or the controllers on the drive or motherboard, are showing errors in passing data. I believe on-disk CRC errors are dealt with in the background by triggering the bad sector routine.
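
One way to check whether copies are arriving intact, independent of SeaTools, is to checksum the originals and the copies and compare them. Here's a minimal Python sketch of that idea. The function names `file_sha256` and `find_corrupt_copies` are my own placeholders, not part of any tool mentioned in this thread, and the sketch assumes the copy mirrors the source folder layout.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_copies(source_dir, copy_dir):
    """Compare every file under source_dir with its counterpart in copy_dir.

    Returns a sorted list of relative paths that are missing from the copy
    or whose contents differ from the original.
    """
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    bad = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or file_sha256(src) != file_sha256(dst):
            bad.append(rel.as_posix())
    return sorted(bad)
```

Calling `find_corrupt_copies` on the same source folder and its copy on the suspect drive, once in computer X and once in computer Y, would show whether the mismatches follow the drive or the machine. Keep in mind that freshly written files may be read back from the OS cache rather than from the disk, so it's safer to test with more data than you have RAM, or after a reboot.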
posted by gjc at 7:02 AM on April 24, 2009


I had a PC that was thankfully under warranty that blew a hard drive, then a motherboard because of a short in the power supply. The blowouts had all the makings of a motherboard problem and threw the techies for a loop until they were testing the new motherboard and discovered the problem.
posted by Pollomacho at 8:02 AM on April 24, 2009


The only thing I'll add to the above is, when it comes to these strange hardware behaviors, it can be anything. Most often it's RAM, the motherboard, or the power supply. These are the big 3 people usually dismiss as not being the problem. If you have another mobo, try some testing.
posted by teabag at 8:14 AM on April 24, 2009


I asked my dealer to get B replaced under warranty. Its replacement C arrived day before yesterday.

Maybe you got a refurbed disk or a bad disk. I'm guessing that SeaTools marked the bad sectors already, so when you moved it to the other machine the filesystem knew to avoid them, thus the lack of errors.

I arrived at the inference that the new HDD C isn't bad after all

How many bad sectors does it have?
posted by damn dirty ape at 9:25 AM on April 24, 2009


Out of interest, did you fit the same power supply cable to both B and C in X? I've had a lot more drive failures/problems that were power supply related than SATA controller related. Also, some SATA hard drives work better on a real SATA power connector, rather than a Molex connector or Molex-to-SATA adapter, as the latter don't have the 3.3 V line.

May be worth swapping the power leads for A & C (or borrowing the DVD drive one) to see if that helps; modern power supplies have multiple rails, and B may be on a bad or overloaded rail.

That you're having no problems with A with B gone indicates that your drivers are OK, though that was my initial thought (Page fault in non-paged area is almost always a driver error, though a disk read error logically could also cause it).

You've moved B from one SATA port to another, so it seems unlikely that two or more would fail while leaving A's working fine, though it's certainly possible. Try C on A's onboard SATA port in X, and test with a boot CD.

Certainly, it's possible to get back a bad refurb'd drive. Seagate's warranty replacements are not new drives, but ones sent in for repair previously (though a majority aren't actually faulty in the first place; the problem lies elsewhere). SMART errors would indicate bad sectors on the drive itself; CRC errors are transmission errors, and almost invariably due to faults other than mechanical/magnetic failure on the platters.

Would be interesting to know what errors the SeaTools long test picked up, i.e. whether they were bad sectors.
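
If you can get at the SMART attribute table (for example with `smartctl -A` from smartmontools), the raw counters give a rough hint about which side the fault is on. Here's a minimal Python sketch, assuming the usual ten-column attribute table. The sample text is made up for illustration and is not output from any drive in this thread, and the split of attributes into "media" and "link" groups is my own simplification. Raw values can be vendor-specific, so treat the result as a hint only.

```python
# Attribute IDs as they appear in the SMART attribute table.
# Grouping them this way is an assumption of this sketch.
MEDIA_ATTRS = {5: "Reallocated_Sector_Ct",
               197: "Current_Pending_Sector",
               198: "Offline_Uncorrectable"}
LINK_ATTRS = {199: "UDMA_CRC_Error_Count"}

# Invented sample in the layout printed by `smartctl -A`.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       14
"""


def parse_raw_values(smartctl_output):
    """Map attribute ID -> raw value for rows whose raw value is a plain integer."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


def classify(values):
    """Return 'media', 'link', 'both' or 'clean' based on non-zero counters."""
    media = sum(values.get(attr_id, 0) for attr_id in MEDIA_ATTRS)
    link = sum(values.get(attr_id, 0) for attr_id in LINK_ATTRS)
    if media and link:
        return "both"
    if media:
        return "media"
    if link:
        return "link"
    return "clean"


if __name__ == "__main__":
    print(classify(parse_raw_values(SAMPLE_OUTPUT)))
```

A "link" result (interface CRC count rising while the sector counters stay at zero) would point at the cable, controller or power rather than the platters; a "media" result would point at the drive itself.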
posted by ArkhanJG at 9:54 AM on April 24, 2009


Response by poster: Thanks for all the replies.

ArkhanJG, I'll definitely switch onto another rail for HDD C and check. Hope it's that simple.

Say, if I want to test C from a Linux CD, I'll need NTFS read/write abilities. What's the current state of that? And which distro should I download for this purpose?
posted by Gyan at 4:39 AM on April 25, 2009


ntfs-3g is a mature solution for reading and writing NTFS partitions under Linux - I've used it a number of times to recover or remove files from partitions Windows itself was having trouble with. It's in pretty much every distro, so any recent live CD should be able to do it.

Personally, I use SystemRescueCd for such testing.
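
For what it's worth, mounting boils down to a single `mount -t ntfs-3g` call. Here's a minimal Python sketch that builds and runs it. The helper names are my own, and the device and mount point in the usage note are placeholders you'd replace with whatever the live CD assigns to drive C.

```python
import subprocess


def ntfs_mount_command(device, mount_point, read_only=False):
    """Build the argument list for mounting an NTFS partition via ntfs-3g."""
    command = ["mount", "-t", "ntfs-3g"]
    if read_only:
        # Mount read-only when you only want to inspect the partition.
        command += ["-o", "ro"]
    command += [device, mount_point]
    return command


def mount_ntfs(device, mount_point, read_only=False):
    """Run the mount command; needs root and an existing mount point."""
    subprocess.run(ntfs_mount_command(device, mount_point, read_only), check=True)
```

Something like `mount_ntfs("/dev/sdb1", "/mnt/c")` would then give read/write access, assuming the partition really is `/dev/sdb1` and `/mnt/c` exists.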
posted by ArkhanJG at 9:30 PM on April 25, 2009

