reliable RAID on SATA PCI card?
March 21, 2007 11:40 AM Subscribe
AcronymFilter: SATA RAID on PCI. I had one, now it's broken.
I've been needing more space for all my layered photoshop files, so I picked up a couple of Seagate SATA 300G drives. I already have a couple of 300s on the motherboard SATA ports, so I purchased a Belkin PCI card to give me two more ports. (while I was at it, I bought a case with a lot more power, so that's not the problem.) This setup worked flawlessly for a little over two weeks - heck, the RAID worked before I even ran the driver install.
Two nights ago, I left the computer running some Photoshop scripts (layered, masked sharpening) on giant stitched files, and woke up to the computer off, and refusing to reboot. After the safe mode prompt, the list of system files being loaded stopped at mup.sys, and if I tried to restart windows normally, it would start to fade into the Windows XP splash screen with progress bar, then hang.
Not having time for bullshit yesterday morning, I poked around during work and found something online that suggested I remove my wireless mouse dongle. When I got home, I went into BIOS, and all my IDE drives except the boot drive had been un-recognized, so I got those back, and booted the machine. The Belkin RAID text-screen then told me that one of the drives was offline, so my RAID was incomplete. I continued, and pulled the mouse dongle after the safe mode selector screen, and after a minute or two, the logon screen showed up. I quickly put in a PS/2 mouse, and was able to run Windows normally.
The RAID log says one of the two drives took a nap at 1:30 in the am (about what time I expected the scripts to end). I turned off the machine, removed the PCI RAID controller from the slot, and the damn thing works as it did two weeks ago, before I added the drives. Can anyone suggest either 1) a way to make the SATA card I've got work, or 2) a PCI RAID SATA card that won't send me into conniptions?
Response by poster: I do understand RAID. I had it set up as a RAID 1. The only ones available were 0 and 1. My big problem is that one of the two drives is missing. It just occurred to me that I ought to switch the cables to determine if it's a port on the card, or one of the physical drives.
Except, no matter what, whenever that card is in, I have to play stupid USB mouse tricks any time I want to boot the machine into Windows.
Other data: in both the text-based (bootup) and Windows utilities, one of the two drives doesn't exist. Yes, I've checked that they're both getting power.
posted by notsnot at 1:15 PM on March 21, 2007
No, what you have is a bad drive.
MUP.sys error usually indicates that one of the system drives is bad. Nothing about your mouse dongle or whatever... unlike Mac or Linux, Windows expects you to figure out drive problems on your own and the system just hangs.
If you can put the PCI card back in, and boot to safe mode, run chkdsk (or whatever the application is) on all the disks, you might be able to repair it long enough to get the info off the drives.
Did you have the other two 300GB drives configured as a 600GB RAID0 partition, or one 300GB RAID1 partition? Sounds like RAID0 to me, which would mean the stripe (and therefore your data) may be hosed.
Don't blame it on the Belkin, it did what it's supposed to do. The drives were bad.
posted by SpecialK at 1:17 PM on March 21, 2007
Oh, misread the mouse dongle thing. Yes, WinXP safe mode hangs sometimes when you have USB peripherals that are essential to the system, like a mouse, with 3rd party drivers.
God, windows sucks. (Haven't had to troubleshoot it for years.)
posted by SpecialK at 1:21 PM on March 21, 2007
Response by poster: Both pairs of 300s are RAID 1 - really, I just need a lot of reliable storage, not really fast storage. RAID 0, to me, sounds like an oxymoronic (not redundant!) way to be fast and double your chances at a catastrophic failure.
The remaining "new" drive shows up in Windows, no problem. I even copied a few things off of it while I could.
One of the two new drives - connected through the card - does not show up anywhere. I guess I can try to put the new drives on the motherboard ports, and run checkdisk in that configuration.
posted by notsnot at 1:37 PM on March 21, 2007
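The "double your chances" intuition about RAID 0 can be checked with a little arithmetic. The sketch below assumes drives fail independently with the same probability `p` over some period; the value 0.05 is a made-up illustration, not a measured failure rate for any drive in this thread.

```python
def raid0_loss_probability(p, drives=2):
    """Chance of losing the array when ANY one drive failing destroys it (striping)."""
    return 1 - (1 - p) ** drives


def raid1_loss_probability(p, drives=2):
    """Chance of losing the array when ALL drives must fail to destroy it (mirroring)."""
    return p ** drives


if __name__ == "__main__":
    p = 0.05  # hypothetical per-drive failure probability, for illustration only
    print(f"single drive : {p:.4f}")
    print(f"2-drive RAID0: {raid0_loss_probability(p):.4f}")
    print(f"2-drive RAID1: {raid1_loss_probability(p):.4f}")
```

For small `p`, the RAID 0 figure comes out at just under twice the single-drive risk, while the RAID 1 figure is much smaller. Real drives from the same batch are not fully independent, so treat this as a rough guide only.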
Best answer: That 2nd new drive (that's not working) might've had a catastrophic failure and corrupted part of the other one or the main system. The good thing is that it'll still be under warranty. And you didn't lose any data.
This is why RAID1 is a good idea.
Be aware that if you connect the working new drive to your motherboard, it might not work at all. Many low-end raid controllers are proprietary as far as how they store the data on the hard drive.
posted by SpecialK at 1:49 PM on March 21, 2007
I think you're going down the right path in putting the drives onto an interface (the motherboard) that's known-good, so that you can see if it's the drive that's the problem, or the RAID card.
I'm with SpecialK in suspecting, based purely on past experience, that it's the drive that went bad, rather than the RAID card. It's not really uncommon for drives to fail very quickly after they're installed (IIRC, people have done tests and determined that drive failures follow a 'bathtub curve' where they either fail when new, or a long time down the road) because of manufacturing defects. It was just a "bad apple."
So I'd put the bad drive onto a SATA port that you know is working, and see if it works. If it does, then you need to get a new RAID card, potentially.
If, as I suspect, it's still bad, then you need to contact Seagate and get a replacement drive, and then you can hook that up to your RAID card and re-RAID the drives. In that case, you should really be thanking the RAID card rather than cursing it, since it saved your data!
Anyway, be interested to hear how it all works out.
posted by Kadin2048 at 1:51 PM on March 21, 2007
I'm sorry but the words Belkin and RAID together don't conjure an image of reliability for me. Do you have onboard RAID1? That would be more reliable than the Belkin card.
And PCI RAID with SATA drives kinda makes me wonder, what's the point? IIRC, plain 32-bit PCI maxes out at 133 MB/s, shared across the whole bus. (You should be using PCI-X or PCIe.) Basic SATA is 1.5 Gbit/s per port.
I think you should forgo the trouble of RAID and use one disk as a backup drive.
posted by mphuie at 3:10 PM on March 21, 2007
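To put the PCI versus SATA comparison in the same units, here is a rough back-of-the-envelope calculation. It assumes conventional 32-bit, 33 MHz PCI and first-generation SATA with 8b/10b line encoding; these are theoretical peaks, not what the Belkin card or these Seagate drives would actually sustain.

```python
def pci_peak_mb_per_s(width_bits=32, clock_hz=33.33e6):
    """Theoretical peak of a parallel PCI bus: bytes per clock times clock rate."""
    return (width_bits / 8) * clock_hz / 1e6


def sata_payload_mb_per_s(line_rate_gbit=1.5):
    """SATA sends 10 line bits per 8 data bits (8b/10b), so payload is 80% of line rate."""
    payload_bits_per_s = line_rate_gbit * 1e9 * 8 / 10
    return payload_bits_per_s / 8 / 1e6


if __name__ == "__main__":
    print(f"PCI 32-bit/33 MHz, whole bus: {pci_peak_mb_per_s():.0f} MB/s")
    print(f"SATA 1.5 Gbit/s, per port   : {sata_payload_mb_per_s():.0f} MB/s")
```

The takeaway is that two SATA ports behind one plain PCI slot can, on paper, ask for more bandwidth than the shared bus offers, although a single hard drive of this era would not normally saturate either link.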
Have you considered software RAID? I have both hardware RAID (Adaptec RAID-5) and software (Windows RAID-1) and over the past five years both have experienced multiple disk failures and/or power outages and both have recovered quite well. The software RAID runs on Firewire and is reasonably fast - I'm sure that a modern CPU and SATA would provide excellent performance.
posted by meehawl at 4:37 PM on March 21, 2007
Response by poster: I hooked the new drives up to the mobo SATA jacks, and one is still dead. I hooked the known-good old drives up to the PCI card, and they both show up. I guess a drive did die.
I've considered software RAID setups, but the card happened to come with the RAID setup, so I ran with it.
posted by notsnot at 8:58 PM on March 21, 2007
If you hooked up the drives as a RAID 0 array, for performance reasons, you were alternately striping data first to one drive, then the other. The failure of any drive in a RAID 0 set breaks the set, and data is unrecoverable. RAID 0 is never worth using for data storage, since it cuts reliability significantly. And if you are using 32 bit Windows XP, you can't get the performance benefits of even RAID 0, due to internal limitations of Windows storage drivers.
If you hooked up the drives and put them in RAID 1, you should be able to recover all the data you had, as each drive contained a mirror copy of the contents of the other. You just need to put the Belkin card back in, and use its utilities to break the RAID set back to individual volumes, then replace the failed drive, and re-RAID the set.
It's also possible the Belkin card supports compound RAID levels for 2 drive sets, such as RAID 0+1 or RAID 10 (1+0), but if you don't understand RAID, it doesn't seem probable you would have set up those advanced modes.
posted by paulsc at 11:59 AM on March 21, 2007
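The difference between striping and mirroring described above can be shown with a toy model. This is only a sketch: the function names are made up for illustration, lists stand in for drives, and real controllers (including the Belkin card) use their own on-disk layouts.

```python
class ArrayFailed(Exception):
    """Raised when the toy array can no longer return its data."""


def stripe(blocks, n_drives=2):
    """RAID 0 style: deal blocks out to the drives in turn."""
    drives = [[] for _ in range(n_drives)]
    for i, block in enumerate(blocks):
        drives[i % n_drives].append(block)
    return drives


def mirror(blocks, n_drives=2):
    """RAID 1 style: every drive holds a full copy."""
    return [list(blocks) for _ in range(n_drives)]


def read_raid0(drives):
    """Needs every drive; a dead drive (None) makes the data unrecoverable."""
    if any(d is None for d in drives):
        raise ArrayFailed("a striped member is missing")
    out = []
    for i in range(max(len(d) for d in drives)):
        for d in drives:
            if i < len(d):
                out.append(d[i])
    return out


def read_raid1(drives):
    """Needs only one surviving drive."""
    for d in drives:
        if d is not None:
            return list(d)
    raise ArrayFailed("all mirror members are missing")


if __name__ == "__main__":
    data = list("ABCDE")
    r1 = mirror(data)
    r1[1] = None  # simulate the dead drive
    print("RAID 1 after one failure:", read_raid1(r1))
    r0 = stripe(data)
    r0[1] = None
    try:
        read_raid0(r0)
    except ArrayFailed as err:
        print("RAID 0 after one failure:", err)
```

With one member gone, the mirrored set still returns everything, while the striped set has only every other block left, which matches what happened in this thread once the dead drive was identified.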