My RAID isn't rebuilding.
April 13, 2007 3:53 PM

My RAID is failing to rebuild: help! Every time I try and rebuild it with a new drive, the rebuilding fails.

I have a RAID-5 array on my home server, built around a MegaRAID SATA 150-6D card with 4 drives attached (WD500KS 500GB drives). This has been working well, but one of the drives failed. No problem, I thought. I’ll swap it out with a new one. I did that, but that new drive failed to rebuild. So, I tried another one (this one is a Maxtor drive; I couldn’t get hold of one of the Western Digital ones). Same thing: it failed again. I tried another Maxtor drive, and it failed again. I'm stumped on this. I’ve attached the appropriate section of the log file below; anybody got any ideas?


Notify message : DRIVE STATE changed in Port 1 ID 1 to REBUILD - Thu Apr 12 16:47:09 2007
Error on Rebuilding PORT 1 TARG 1 - Fri Apr 13 14:42:45 2007
Notify message : DRIVE STATE changed in Port 1 ID 1 to FAILED - Fri Apr 13 14:42:46 2007
NOTIFY:Check Condition on Ch 1 ID 2 with the following sense key - Fri Apr 13 15:35:54 2007

Time Stamp Date = On Apr,13 2007 At 15:8:2

The CDB = 28 00 38 8f ab 00 00 00 80 00

Sense Data = 70 00 03 00 00 00 00 0b 00 00 00 00 11 00 00 00 00 00
NOTIFY:Check Condition on Ch 1 ID 2 with the following sense key - Fri Apr 13 15:35:54 2007

Time Stamp Date = On Apr,13 2007 At 15:8:5

The CDB = 28 00 38 8f ad 80 00 00 80 00

Sense Data = 70 00 03 00 00 00 00 0b 00 00 00 00 11 00 00
posted by baggers to Computers & Internet (14 answers total)
 
Perhaps the drive you're trying to replace is slightly larger than the replacement. Unfortunately, a 500GB drive is not always a 500GB drive (even accounting for the manufacturer's use of 1,000,000 bytes to mean a megabyte). There is some variation between models.
posted by wierdo at 4:16 PM on April 13, 2007


wierdo: I just looked up the Maxtor and WD 500GB SATA drives and they all have the exact same number of 'guaranteed sectors' so I'm guessing they are probably exactly the same size as well.

I think your best bet is to try an identical WD drive right now, though; RAID controllers can be rather pedantic about things sometimes.
posted by public at 4:24 PM on April 13, 2007


I have experience with LSI MegaRAID cards. Do you know what firmware version you are using?
posted by Industrial PhD at 4:26 PM on April 13, 2007


Response by poster: Wierdo/public, I only have the WD drive that failed, and this generated the same error. I'm reasonably certain it's not the drive that is at fault here.

Industrial, firmware version is 713R. I updated it to the latest version I could find online.
posted by baggers at 4:30 PM on April 13, 2007


The company I work for uses the LSI MegaRAID card for some of its equipment. We have had many versions of the firmware since the 713 series. I think we're on T3@$ or something like that. I can see if I can get a copy of the firmware for you. I do need to make sure that what we are getting is not proprietary to our equipment, as I don't really think you want to hose the rest of your array, as well as the card itself.
posted by Industrial PhD at 4:49 PM on April 13, 2007


Response by poster: Industrial, that would be great. Or if there is somewhere I can download it, that would be fine: 713R is the latest I could find on the LSI site.
posted by baggers at 4:51 PM on April 13, 2007


I've got a build of the firmware in a .rom file. I can send it to you via e-mail or IM or whatever. It is for a PCI MegaRAID card with 6 internal SATA connectors. I'm not sure of the model number for the card itself. I don't believe flashing the firmware would clear it out, but I can't be sure.
posted by Industrial PhD at 5:02 PM on April 13, 2007


Response by poster: Just to update this: I created a new logical drive using the two Maxtor drives (both of which had failed on the rebuild), and it worked fine. So, I am guessing that these drives are fine.

Industrial, email it to me at richard at baggers dawt com, and I'll give it a go. My card is a PCI one with 6 SATA connectors, so it should work. Thanks!
posted by baggers at 5:13 PM on April 13, 2007


Sent.
posted by Industrial PhD at 5:22 PM on April 13, 2007


That CDB entry is a SCSI Read command. If I had to guess, I'd say that it's failing to read a sector on one of the existing disks while trying to rebuild. After all, there's no reason to do a read from the new virginal disks. You may want to try ddrescue on the individual drives to see if they have errors and if they can be recovered.
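
For anyone who wants to check that reading of the log, here is a minimal Python sketch that decodes the two CDBs and the sense data quoted in the question. It assumes the CDB is a standard 10-byte READ(10) and that the sense data is in the standard fixed format (response code 0x70). The function names and the small lookup tables are made up for illustration and only cover the values seen in this log; this is not anything the LSI software provides.

```python
# Decode the READ(10) CDBs and fixed-format sense data from the controller log.
# Assumption: standard SCSI layouts; the tables below are deliberately partial.

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
}

# (ASC, ASCQ) pairs; only the one that appears in this log is listed.
ADDITIONAL_SENSE = {
    (0x11, 0x00): "UNRECOVERED READ ERROR",
}


def decode_read10(cdb_hex):
    """Return (lba, block_count) for a 10-byte READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


def decode_fixed_sense(sense_hex):
    """Return (sense_key, asc, ascq) for fixed-format sense data given as hex text."""
    sense = bytes.fromhex(sense_hex)
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F, sense[12], sense[13]


for cdb_text in ("28 00 38 8f ab 00 00 00 80 00", "28 00 38 8f ad 80 00 00 80 00"):
    lba, blocks = decode_read10(cdb_text)
    print(f"READ(10): LBA {lba}, {blocks} blocks")

key, asc, ascq = decode_fixed_sense(
    "70 00 03 00 00 00 00 0b 00 00 00 00 11 00 00 00 00 00"
)
print(SENSE_KEYS.get(key, "unknown"), "/", ADDITIONAL_SENSE.get((asc, ascq), "unknown"))
```

Both commands are 128-block reads (LBA 948939520 and LBA 948940160), and the sense data decodes to a medium error with an unrecovered read error. That is consistent with a bad area on the drive at Ch 1 ID 2, which is not the drive being rebuilt (Port 1 ID 1).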
posted by leakymem at 5:13 AM on April 14, 2007


I'm with leakymem - it definitely looks like one of your other drives has issues.
posted by polyglot at 5:58 AM on April 14, 2007


Response by poster: leakymem/polyglot - I am not running BSD: I'm running Windows XP. I have run chkdsk and it doesn't report any errors. Is there a Windows equivalent of ddrescue?
posted by baggers at 12:36 PM on April 14, 2007


Response by poster: Okay, this is frustrating. So I can't rebuild the array, and an analysis program I tried (R-Studio) says that there are unexpected MFT errors on the disk, and that I should check the consistency of the disk. But the LSI software won't check the consistency until I rebuild it. Gah.

Any ideas, anyone?
posted by baggers at 4:06 PM on April 14, 2007


Get yourself a bootable Linux CD ("LiveCD") and run ddrescue from there. I believe Ubuntu is reasonably easy to use if you're new to Linux. Do you know any tame geeks locally that'd be able to help you out?

This Google Search may be of some assistance.
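
If the goal is only to find out which of the existing drives has unreadable areas, a read-only pass over each disk is enough to show that. The following is a rough Python sketch of such a pass, not a replacement for ddrescue: it copies nothing and does not retry. It assumes the drive has been moved to a plain SATA port so the LiveCD sees it as an ordinary block device, and that the script runs as root. The function name `scan_for_read_errors` and the device path in the usage note are hypothetical.

```python
import os


def scan_for_read_errors(path, chunk_size=1024 * 1024):
    """Read `path` from start to end without writing anything.

    Returns a list of (offset, length) byte ranges whose read raised an
    I/O error. An empty list means every chunk was readable.
    """
    bad_regions = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # total size of the file or device
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, length, offset)
            except OSError:
                # Record the unreadable chunk and skip past it.
                bad_regions.append((offset, length))
                offset += length
                continue
            if not data:
                break  # unexpected end of data
            offset += len(data)
    finally:
        os.close(fd)
    return bad_regions
```

It could be called as `scan_for_read_errors("/dev/sdb")`, where `/dev/sdb` stands in for whatever name the LiveCD gives the drive. Dividing a reported byte offset by 512 gives an approximate sector number to compare against the addresses in the controller log. A smaller `chunk_size` narrows down the bad area at the cost of a slower scan.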
posted by polyglot at 1:01 AM on April 15, 2007

