RAID Madness! (Well not really)
February 4, 2011 11:54 AM   Subscribe

Please help me get my RAID 5 disks organized. I have two RAID 5 arrays that I feel I could make better use of.

I have two of these drive enclosures, each populated with four 2TB disks (WD Green Power). Both are connected to an iMac via FireWire 800.

The first array, Bender, is the main media storage for my machine, mostly photographs, TV shows and movies, music. Bender also includes backup files from my internal hard drive, which includes virtual machines, Aperture library, and other documents.

The second array, Flexo, is a mirror of Bender. I've written very simple rsync scripts to keep top-level folders in sync. I use nothing directly off of Flexo; it's there in case of Bender failure. I do use offsite storage for backup, plus a FW800 portable disk for important files (Aperture library and photographs, documents...), also kept in sync with rsync scripts.

My question is, this seems like a tedious method to keep everything in sync. Is there an easier way? Cron, or something else?

Another thing that bothers me: The RAID arrays seem to throw a disk every now and then, seemingly because of a bad disk. I take it out, reformat and put it back in, and let the array rebuild again, and the disk seems fine. What's the deal there? I read somewhere that Green Power disks are not designed to be in a RAID array because they take themselves out of the array when there is a problem.

Ideally, I am looking for a low-hassle way to store my media, and have some redundancy for a bad disk or two. I've looked at Drobo units, but the bad reviews scare me. I do love the idea of adding disks on the fly, though. I'm using FW800 now, would a gigabit ethernet connection be faster?

Thanks for reading the multiple questions, any help you can offer is much appreciated.
posted by santaliqueur to Computers & Internet (9 answers total) 4 users marked this as a favorite
 
Sure, you can use cron. You'll have to set up one cron job to sync your iMac's files to Bender as a backup, and another (probably an hour or two later) to sync Bender (the whole volume?) to Flexo.

Rsync with -a (and maybe -z) is what you want. (The first time you run it, add in -v and/or --progress to track progress.) Also set up an "rsync includes" text file that you can point to with the --include-from flag to keep only the directories you want in sync.
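For example, a pair of crontab entries might look like the following; the paths, times, and staggering are illustrative, not prescriptive:

```
# crontab -e  (paths and schedule are illustrative)
# 2:00 AM: back up the iMac's home directory to Bender
0 2 * * * /usr/bin/rsync -a --delete /Users/you/ /Volumes/Bender/Backup/
# 4:00 AM: mirror Bender to Flexo, after the first job has had time to finish
0 4 * * * /usr/bin/rsync -a --delete /Volumes/Bender/ /Volumes/Flexo/
```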
posted by supercres at 12:14 PM on February 4, 2011


Response by poster: This is what I've been running manually, multiple times for multiple folders.

rsync -vur --delete /Volumes/Bender/Audio/ /Volumes/Flexo/Audio
posted by santaliqueur at 12:22 PM on February 4, 2011


The RAID arrays seem to throw a disk every now and then, seemingly because of a bad disk. I take it out, reformat and put it back in, and let the array rebuild again, and the disk seems fine. What's the deal there? I read somewhere that Green Power disks are not designed to be in a RAID array because they take themselves out of the array when there is a problem.

Desktop-class drives are designed with the assumption that each drive will be flying solo, that is, that there is no redundancy. Therefore, if the disk can't read a sector, it will initiate "heroic" error recovery procedures to attempt to get the information. This can take a long time (30+ seconds), during which the drive will frequently be unresponsive to commands. Many RAID controllers interpret this as a bad drive or a drive removal, and remove it from the array. When you reformat, the bad sector is remapped (if it wasn't already remapped at the conclusion of error recovery).

Higher-end enterprise drives tend to give up sooner, because it is assumed they'll be used in a RAID, where if one drive can't read the info, it's faster to rebuild the sector from the RAID than to initiate heroic error recovery. These drives are often configured to throw an unrecovered read error after 2-5 seconds, which the RAID controller can easily recover from.

Your recourse is a bit limited since it's a design assumption, but you could try:
1. Check whether your RAID controller has a means of changing the timeout value.
2. Some desktop-class drives support limiting the error recovery time through special commands (often called TLER or ERC). These can be accessed through software tools, if you're inclined, although it appears that WD Green Power drives don't support it. Upgrading to a Barracuda ES (Seagate) or WD RE (RAID Edition) drive would allow you to do this. Probably not something you want to do with so much $$ already invested in drives...
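For drives that do support it, smartmontools can read and set the error-recovery timeout. A sketch, with caveats: values are in tenths of a second, "/dev/disk2" is a placeholder for your actual device, and whether the command can reach a drive sitting behind a FireWire RAID enclosure at all is another matter:

```
# Show the current SCT error recovery setting (if supported)
smartctl -l scterc /dev/disk2

# Cap read/write error recovery at 7 seconds (70 tenths), RAID-friendly
smartctl -l scterc,70,70 /dev/disk2
```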
posted by jdwhite at 12:50 PM on February 4, 2011 [1 favorite]


Response by poster: Interesting, jdwhite; I think that is what I read elsewhere, but I couldn't remember any of the details. It sounds like that is my problem, since each disk seems to be non-faulty when I reformat/rebuild.

I do have quite a bit of money invested in drives (for a home/office setup). Next time upgrade time comes around, I'll likely skip the Green Power drives. They were a good idea in theory, but are now presenting a headache.
posted by santaliqueur at 1:21 PM on February 4, 2011


A few issues here...
  • Yes, you can use cron to schedule an rsync backup. This is bad. Say 20% of your data goes corrupt due to bit rot or user error or whatever, and you don't notice for a week or even a few months. Too bad: you've already pushed that bad data over to your backup. Instead of rsync, which can't do snapshots, think of something along the lines of rdiff-backup.
  • When I tried to rsync data over with the -z option, it was slower than without, because compression burned far too much CPU on data that was already compressed (e.g., JPEGs, M4Vs, etc.). I was actually better off sending the data without the compression option, at which point I hit the network limit of 1 gigabit/second. You may want to experiment here: if your CPUs can keep up with compressing 125 MB/second worth of data, then even a 2% gain in throughput makes -z work to your advantage.
  • The WD EARS (aka Green) drives are reported by some to cause issues. In addition to what jdwhite said, this is due to the constant sleep/wake (or more precisely, I think, park/unpark of the drive heads). This parameter can be tuned on some models, but it's not fun. Some claim no problems with properly tuned drives; others claim no problems using them straight out of the box. YMMV, but when I threw together my Linux RAID server for my MacBooks and Mac Mini, I went with Samsung Spinpoints and have had no such failures at all in almost a month. I don't want to kill a drive in 3 months because it went through too many sleep/wake cycles, having never been designed to be used in a RAID array.
  • Have you run a SMART test on them yet? My system runs SMART every other day and lets me know if there's a problem.
  • You scrub your arrays to proactively check for corruption, right? On my system, mdadm by default checks monthly, though when I first set things up, I ran three checks the first week, just to see how it would go and had no problems. I also have a RAID 6 array, so I have more parity votes than you. Not sure if scrubbing a RAID 5 array works as well since you only have one parity vote and I'm not sure if it can reconstruct based on that alone.
  • I, too, liked the idea of expandable, reliable storage for large media collections. I'm super into photography, each 35mm scan of Provia takes up 125 MB, and the GF does weddings on the side. Plus all our CDs and DVDs and backups of all the computers. I had an original Drobo, hated it, lusted after a QNAP, then a Synology, then realized they all ran Linux anyhow and I know how to handle any issues that would arise, so I just built my own. If you're cool with a command line (you might be, since you've at least heard of rsync), it's fairly easy to set up a small Linux box for less than a QNAP, and you can make it do other neat things too; this might be an option for you.
  • As mentioned above, I've got a RAID 6 array; that means I can have two drives die on me at the same time. Or, phrased another way, I can replace a drive to grow the array and, while it's growing, still have one drive fail on me. You've got a RAID 5 array, so you can only have one drive fail, or, if growing your array, no drives can fail. Otherwise, you lose data. Not to mention that if something goes south silently and you don't notice it right away, you've already rsynced it to your backup.
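On a Linux box, the SMART monitoring mentioned above can be driven by smartd. A minimal /etc/smartd.conf sketch (device name and mail address are placeholders, and this runs the self-test daily rather than every other day, for simplicity):

```
# /etc/smartd.conf
# Monitor all attributes (-a), enable automatic offline testing (-o on)
# and attribute autosave (-S on), run a short self-test daily at 2 AM,
# and mail a warning on any problem.
/dev/sda -a -o on -S on -s S/../.././02 -m admin@example.com
```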

posted by Brian Puccio at 1:27 PM on February 4, 2011 [1 favorite]


Response by poster: When I was looking into backup scripts, I came across rdiff, but used rsync because there was an option to delete files that no longer existed on the source. Perhaps rdiff has this option as well, but for some reason, I chose rsync. I'll have to look into rdiff.

I'm sure my CPU can keep up (if any can), I have a current 2.8 GHz i7.

Regarding SMART status, I couldn't find a utility that would monitor external drives.

I thought scrubbing arrays with mdadm was for software RAID. The RAID is done in the enclosure itself, OS X sees only one disk. Do I need to still periodically scrub?

I think what I really need is an 8+ bay enclosure with a RAID 6 setup. My data is sitting comfortably on the 6 TB array, though the data that I'd really hate to lose is less than 500 GB (photographs, Aperture library, VMs and documents), and is all backed up in multiple places.

I think I might have to look into assembling my own box. While I'm relatively new to the tools mentioned here, I'm very comfortable with the command line. I'm limited to the connectivity of the iMac (FW800), but I think I may have to change my entire setup. Thanks for the replies so far.
posted by santaliqueur at 3:15 PM on February 4, 2011


Is the only reason you're rsyncing them to keep them mirrored? Could you use Disk Utility to designate Bender as the first slice of a two-slice RAID 1 set, with Flexo as the mirror? Then you'd have RAID 5+1, and you could (theoretically) sustain a two-drive failure across the two distinct RAID 5 arrays.

Otherwise, if you just want to keep them synced on a daily basis for recovery's sake (oh crap, didn't mean to delete that), there's the always wonderful Carbon Copy Cloner, which is more or less a nice front end to rsync. I have CCC set to automatically clone my boot disk when I plug the drive in to my iMac, then I have a cron job mount it every night with diskutil, and a post-cloning script that unmounts the disk so it's not online and available for corruption/wayward processes/etc. I wake up in the morning and see the CCC report that says it backed up 4 gigs, and I know it's working. All the info is on CCC's support site, but MeMail me if you want further information on it. You could do slices of disks and folder-specific backups this way too, I suspect, although I've never done it.
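The mount/unmount plumbing described above is only a couple of lines; "disk2s2" below is a placeholder for whatever identifier `diskutil list` shows for your backup volume:

```
# crontab entry: 1:00 AM, bring the backup volume online before CCC runs
0 1 * * * /usr/sbin/diskutil mount disk2s2

# post-clone script: take it offline again once the clone finishes
/usr/sbin/diskutil unmount disk2s2
```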

Ultimately my solution for much the same problems ended up being to build a Linux box and RAIDed the disks with mdadm. Ubuntu 10.04's netatalk speaks the right bits for Time Machine right out of the box, and extending avahi to make the Linux box look like another Mac on the network is pretty easy as well. It's marginally slower than my directly connected FW800 disk, but not by much. OSX seems to handle remote disks spinning down and going offline a lot better than it handles local disks spinning down, too. And the price is right if you're comfortable with command line tools. This is a good primer - although it's focused on Time Machine, if the network drive is available for that, it'll be available for most everything else you can throw at it: http://ncatarino.net/archives/592
posted by Kyol at 5:49 PM on February 4, 2011


When I was looking into backup scripts, I came across rdiff, but used rsync because there was an option to delete files that no longer existed on the source. Perhaps rdiff has this option as well, but for some reason, I chose rsync. I'll have to look into rdiff.
rsync is pretty straightforward and is the best way to make sure one set of data always matches another without recopying the entire set again. The only problem is that you can't roll back to any point in time. That's not a fault of rsync.

Use the --remove-older-than option with rdiff-backup to prune older backups and make sure you don't run out of room.
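For reference, rdiff-backup's basic shape looks like this; it's a separate install, the paths are illustrative, and the retention window (8 weeks) and restore point (3 days ago) are arbitrary examples:

```
# Mirror Audio, keeping reverse-diff history inside the destination
rdiff-backup /Volumes/Bender/Audio /Volumes/Flexo/Audio-backup

# Prune history older than 8 weeks so the backup doesn't grow forever
rdiff-backup --remove-older-than 8W /Volumes/Flexo/Audio-backup

# Restore a file as it existed 3 days ago
rdiff-backup -r 3D /Volumes/Flexo/Audio-backup/somefile.txt restored.txt
```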
I have a current i7 2.8 GHz.
Nice. C2D here still and probably for another year.
I thought scrubbing arrays with mdadm was for software RAID. The RAID is done in the enclosure itself, OS X sees only one disk. Do I need to still periodically scrub?
Hardware RAID controllers also offer scrubbing. E.g., Dell PERC scrubbing.

Whether there's an actual RAID card in there running hardware RAID or just a Linux appliance (like many routers, and even Synology, which also uses mdadm), I don't know, but it should be able to scrub the array.
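On a Linux software-RAID box, a scrub is just a request to the kernel ("md0" is illustrative; hardware enclosures expose this, if at all, only through their own utilities):

```
# Kick off a consistency check of md0, then watch its progress
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat

# Mismatch count after the check; non-zero warrants investigation
cat /sys/block/md0/md/mismatch_cnt
```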
I think what I really need is an 8+ bay enclosure with a RAID 6 setup. My data is sitting comfortably on the 6 TB array, though the data that I'd really hate to lose is less than 500 GB (photographs, Aperture library, VMs and documents), and is all backed up in multiple places.

I think I might have to look into assembling my own box. While I'm relatively new to the tools mentioned here, I'm very comfortable with the command line. I'm limited to the connectivity of the iMac (FW800), but I think I may have to change my entire setup. Thanks for the replies so far.
I'm the same way. I really do not want to lose 2 TB of data. Ever. It's backed up several different ways, including two off-site. We lost a bunch of photos about two years ago and since then I've had a (healthy?) paranoia about data loss. The rest of it could be re-ripped, etc.

If you're comfortable rolling your own, you'll save a bit. I spent about $600 on a machine that's about as fast as a Synology DS1511+, about the same size, and offers one more drive bay, so that's about $400 less, or 40% off. I've got a Lian-Li PC-Q08B (got rid of the blue LED in the front) holding 5x Samsung 2 TB Spinpoint drives hooked up to a Gigabyte GA-H55N-USB3 with 4 GB of RAM and a Clarkdale 3.06 GHz i3 CPU. Since the motherboard doesn't have enough SATA ports, I also got a Rosewill RC-218 PCI Express SATA controller.

If you want to know more (down to the mdadm commands and everything else), I'll drop you a MeMail once I write it all up as a howto somewhere.
posted by Brian Puccio at 6:27 PM on February 4, 2011


« Older Help me document my big day for cheap!   |   What do athletes put under braces (tennis elbow... Newer »
This thread is closed to new comments.