how to clone failing disk in RAID 0 array
July 6, 2007 12:44 PM

One of the disks in my hardware RAID 0 system drive array is failing, but it's still booting for the moment. Is there any tool I can use to clone just the failing physical disk to another identical one?

Typical drive image software will not work for this because Windows sees the two disks in the RAID 0 array as a single volume.

What I need to do is basically mirror the failing disk exactly onto the replacement. I just can't find a tool that will do this -- all my searching keeps turning up drive image tools that run under Windows and will happily clone the *entire* volume, but not just the one failing disk.

Because it is a large drive (each one is 500 GB -- I know, I know; I was going to move to RAID 5 shortly, and just hadn't had the time :/), I would like to avoid having to go out and get a 1 TB drive to clone the entire volume, not just for the expense but because I am afraid that the time it takes to clone everything will increase the chances that the disk dies for good somewhere along the way.

Any suggestions?
posted by nnovik to Computers & Internet (9 answers total)
 
Boot off another drive, remove the failing drive from the RAID controller, clone it, put the clone back on the RAID controller, and hope for the best.

But that's just guessing.

Novell's docs on something RAID-related aren't much more helpful:

"A Segment Fails in a RAID 0
If a segment fails in a RAID 0, you must delete the software RAID 0 device, create a new RAID 0 device, then copy your data to the RAID from backup media. For information, see Section 9.10, Deleting a Software RAID Device."

If it's a Windows box, you might be able to back up everything via an online tool like Mozy, then pull the bad disk, replace it, reinstall Windows and then restore from backup.

But in the end, there's probably no easy solution, as RAID 0 is not about fault tolerance.
posted by GuyZero at 1:07 PM on July 6, 2007


Any number of drive cloning tools exist that will make exact sector copies. Acronis True Image or Symantec Ghost are often used by Windows users, but many freeware options exist in the UNIX world, such as the dd command-line tool, which you can run off a bootable Ubuntu or Knoppix CD. You make a bootable disk or CD, disconnect your failing drive from the RAID array, and put it on one channel of a basic IDE or SATA disk controller (pull your RAID controller if it's not motherboard based; if it is, you'll have to break the RAID in BIOS, after powering down and disconnecting drives). Plug in a second drive on that same controller, and boot off the bootable media you made that contains the clone tool. Use the clone tool to clone. Substitute the clone drive in the RAID array.

For this to work, however, you have to use identical models of drives, and your RAID controller must not have written drive markers into the MBR of the drives that were computed by interrogating the drive firmware. Some controllers are easy to fool, and others, with better security awareness, are very tough to fool.

RAID 0 as your primary storage is no way to run a PC, as PC grade hardware is nowhere near reliable enough to do this without recurring trouble. And RAID, even RAID 5 or 6, is no substitute for backing up.
posted by paulsc at 1:17 PM on July 6, 2007


nnovik, if you get the same disk, then "dd" in Unix-ish operating systems will copy it, byte for byte.

Get an Ubuntu boot disk and turn off your computer.

Take out your good disc (just to make sure it's safe), and put in the new disc. Boot Ubuntu.

Once it's going, run System -> Administration -> GNOME Partition tool, and peek at your discs, and see what the names of the partitions are. You'll see something like "/dev/sda1" or "/dev/hdb1" etc.

The "[sh]d[abcdefg...]" part is the name of the disc. The number identifies a partition on it. I'm pretty sure you want to copy the entire disc, so don't use partition numbers.
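
As a small illustration of that naming (a sketch only; /dev/sda1 is just the example name from above, not necessarily what your machine will show), stripping the trailing digits from a partition name gives the whole-disc name:

```shell
#!/bin/sh
# Example partition name, as a partition tool might list it (hypothetical).
partition=/dev/sda1

# Drop the trailing partition number to get the whole-disc device.
disk=$(printf '%s\n' "$partition" | sed 's/[0-9]*$//')

echo "$disk"   # prints /dev/sda
```

This simple rule fits the sdX/hdX style of names described here; other naming schemes may not follow it.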

Once you figure out which is the new disc and which is the old, open a Terminal and type something like
sudo dd if=/dev/source of=/dev/destination
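
If you'd like to rehearse before touching real discs, here is a minimal sketch that runs the same kind of copy on two scratch files standing in for the discs and then checks the result. The file names and the 4 MiB size are made up for the rehearsal. On real hardware the if= and of= values would be whole-disc names such as /dev/sda, and getting them backwards overwrites the disc you are trying to save.

```shell
#!/bin/sh
# Rehearsal only: plain files play the role of the old and new discs.
workdir=$(mktemp -d)
src="$workdir/source.img"        # stands in for the failing disc (hypothetical name)
dst="$workdir/destination.img"   # stands in for the replacement disc (hypothetical name)

# Fill the stand-in "old disc" with 4 MiB of random data.
dd if=/dev/urandom of="$src" bs=1024 count=4096 2>/dev/null

# The copy itself: same shape as the command above, just on files.
dd if="$src" of="$dst" bs=1024 2>/dev/null

# cmp -s is silent and succeeds only if the two are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    clone_status="identical"
else
    clone_status="different"
fi
echo "$clone_status"

rm -rf "$workdir"
```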
posted by cmiller at 1:24 PM on July 6, 2007


Response by poster: paulsc, this isn't my primary storage, this is my system drive, as stated above. My documents drive is separate, and there is nothing on the RAID 0 drive that I can't replace; that is why I risked running it as RAID 0. :) I am simply trying to avoid the pain of having to reinstall my OS and applications, given that it looks like I do have a chance to back up the volume before it dies.

Many thanks to all of you for the suggestions; will try using dd to clone the disk and report back here on how it goes!
posted by nnovik at 1:59 PM on July 6, 2007


dd will work fine, but it will fail entertainingly slowly if there are any read errors. I've previously recommended ddrescue which has a much smarter error-avoidance algorithm.
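
A rough sketch of a two-pass run, assuming the GNU version of ddrescue (there is also a separate tool called dd_rescue with different options). The device names and the map file name are placeholders, not anything from this thread, so the script only prints the commands for you to check before running them by hand:

```shell
#!/bin/sh
# Placeholders: replace these after checking which disc is which.
source_disk=/dev/sdX   # the failing disc (hypothetical name)
target_disk=/dev/sdY   # the replacement disc (hypothetical name)
map_file=rescue.map    # progress record; keep it on some third disc

# Pass 1: grab everything that reads cleanly and skip the slow
# scraping of bad areas (-n). -f is needed to write to a device.
pass1="ddrescue -f -n $source_disk $target_disk $map_file"

# Pass 2: return to the bad areas using direct disc access (-d)
# and retry them up to three times (-r3).
pass2="ddrescue -f -d -r3 $source_disk $target_disk $map_file"

# Dry run: print the commands instead of executing them.
echo "$pass1"
echo "$pass2"
```

Because the map file records what has already been copied, an interrupted run can be resumed by repeating the same command with the same map file.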
posted by Skorgu at 2:18 PM on July 6, 2007


Yeah, Ubuntu and dd are good suggestions; that's how I would do it. I didn't know about ddrescue, that's a good pointer.

Don't let them hassle you too much... you knew it was risky and you made sure you won't lose data you really care about. So thumb your nose and try to save yourself some effort on the rebuild. :)

Just for future information: RAID 0 won't usually give you that much of a performance boost. You'll see some improvement with really big files -- and the fact that you're putting two 500 GB drives in your system tends to imply that you are indeed working with very big files -- but it's usually no more than about 20%. For routine things, it matters not a whit, and it makes it much more likely that you'll lose the combined volume.

Also note: once you've lost one drive of a given type and size, the chances of losing more are much, much higher. Google, who knows more about drive failures than probably anyone on the planet (including the drive manufacturers), says that they go in clumps... the bad units tend to be clustered. Be extra careful with backups as long as that other drive is still in the array.
posted by Malor at 2:46 PM on July 6, 2007


If using dd make sure you use these options:

dd if=/dev/source of=/dev/destination bs=1k conv=sync,noerror

You need the sync,noerror in there or you will truncate on read errors and/or have misaligned data in the output.
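
To see what the sync part does, here is a small sketch on a scratch file rather than a disc (the file names and the 1500-byte size are made up). The last block read is shorter than bs, and conv=sync pads it with zero bytes up to the full block size. On a failing disc, noerror keeps dd going past a bad block and sync fills that block with zeros in the same way, so everything after it stays at the right offset.

```shell
#!/bin/sh
# Demonstration on scratch files only; no real disc is touched.
workdir=$(mktemp -d)

# A 1500-byte input: one full 1k block plus a short 476-byte block.
dd if=/dev/urandom of="$workdir/short.img" bs=1500 count=1 2>/dev/null

# Same options as above: the short block is padded out to 1024 bytes.
dd if="$workdir/short.img" of="$workdir/padded.img" bs=1k conv=sync,noerror 2>/dev/null

# The arithmetic expansion strips any spaces wc puts around the count.
padded_size=$(( $(wc -c < "$workdir/padded.img") ))
echo "$padded_size"   # prints 2048

rm -rf "$workdir"
```

This is also why a small bs makes sense here: with these options each unreadable block is replaced whole, so a smaller block means less good data lost around each bad spot.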

ddrescue, already mentioned, will be faster.
posted by TravellingDen at 5:53 PM on July 6, 2007


Malor:

That tidbit about Google's experience with disk is very interesting. You have linkage, perhaps? This is not a "cite your assertion" snark, I'm actually interested in reading more about it. Thanks in advance.
posted by ZakDaddy at 11:38 PM on July 6, 2007


ZakDaddy: Slashdot's writeup seems okay. They link to the original PDF, so you can look at Google's conclusions yourself. I'm pretty sure Ars Technica did a better writeup on conclusions, but it's not coming up in a real quick search.

Google tested a hundred thousand drives for the study; there's probably no other company in the world that could have done it. I believe only Google has that many hard drives in well-controlled and monitored data centers, as opposed to employee desktops.
posted by Malor at 5:09 AM on July 7, 2007

