How to use all of my old hard drives for redundant storage?
December 20, 2010 5:35 PM

I have a lot of old hard drives hanging around and I'd like to glue them together into some sort of redundant storage "scratch drive" I can store things on. The drives aren't similar in size, and I'd like to add more in the future. How would you recommend I do this?

After steadily upgrading and migrating my various Windows installations over the years I have a lot of older hard drives hanging around collecting dust. I don't know what condition they're in, so I don't want to commit data to them without having some fault tolerance. But since they're from differing generations of computing hardware they're all different sizes / speeds / interface technologies, etc. I'd like to put them to use.

Ideally as a redundant, network-attached storage server and, for bonus points, an iSCSI target.

Things I've researched and tried in one form or another:

Linux RAID

Via FreeNAS. I also came across some work a few years ago (the link escapes me, apologies author!) that involved initializing a section of two drives as a RAID 1, then adding that to an LVM "pool" for use as redundant storage.

Very good work and robust technology, but I'd like to avoid that much heavy lifting with CLI tools if possible. Still, a strong contender. I would have to do some initial ground work getting the volumes set up so FreeNAS could use them correctly, and I'm not sure it supports that sort of configuration anyway (odd-sized drives combined in an ad hoc fashion).

I can make it work, but it would require a bit more effort than I'd like, both to set up and to maintain.
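For reference, the RAID 1 + LVM "pool" approach would look roughly like this. A minimal sketch; the device names, volume group name, and filesystem choice are just placeholders:

# Mirror two roughly equal-sized partitions with mdadm
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
# Add the mirror to an LVM "pool" volume group
pvcreate /dev/md0
vgcreate pool /dev/md0
# Later, another mirrored pair can grow the pool:
# vgextend pool /dev/md1
# Carve a logical volume out of the pool and format it
lvcreate -l 100%FREE -n scratch pool
mkfs.ext4 /dev/pool/scratch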

Windows Home Server

Does exactly what I want, but: it's Windows, and I can't use it as an iSCSI target.

It does the volume management with some sort of Shadow Copy voodoo That Just Works (TM). I'm fine with the price tag, and support is nice, but I don't control it; not really. Also, I'd love to just have a virtual volume I could use as an iSCSI target for my Hyper-V.

ZFS + Raid-Z

Does what I want...after a fashion. It gives me the virtual volume I want, iSCSI is manageable, Solaris, etc. It's all sorts of cool, this file system. It's not designed to use odd-sized volumes like this, but in my testing it does in fact support combining odd-sized volumes in a redundant array.

I'll have to learn more to implement it, but there's a handy guide, I can download and use Solaris free of charge (or use BSD) and it has some nifty features.
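My testing looked roughly like this. A minimal sketch on OpenSolaris; the device and pool names are placeholders, and shareiscsi is the older pre-COMSTAR sharing mechanism, so your setup may differ:

# Pool three odd-sized disks; raidz only uses the smallest member's
# capacity on each disk
zpool create tank raidz c0t1d0 c0t2d0 c0t3d0
# Carve out a block volume (zvol) to export as an iSCSI target
zfs create -V 100G tank/iscsivol
zfs set shareiscsi=on tank/iscsivol   # legacy OpenSolaris iSCSI sharing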

I also considered Drobo, but didn't include it here due to mixed reviews and cost concerns.

So? Anything I've missed? What would you recommend in this case to achieve my aim?
posted by Pontifex to Computers & Internet (16 answers total) 7 users marked this as a favorite
 
Best answer: I don't know much about it, but JBOD (just a bunch of disks: basically all disks, no controller) stuff is interesting. In looking around at some of the things you have already looked at, I found something on Google Code that caught my eye, and that's Greyhole. Might give it a look-see.
posted by deezil at 6:44 PM on December 20, 2010


FreeNAS is great; it worked well for me. I had a FreeNAS box that I just retired; it ran for about 4 years and was still running fine when I retired it.
posted by jmsta at 6:49 PM on December 20, 2010


Best answer: I'm a little out of my league here, but I came here to mention Greyhole (which deezil did for me :) ). Amahi home server uses it. I believe that Greyhole can turn your array of odd-sized disks into one mass of storage, and you can tell it how many times to mirror a file ("These HDs are pretty messed up; I can imagine 3 of them failing before I have time to fix up the array. Better create 4 mirrors.")

I have not used Greyhole, and I'm not an expert, but it's what came to mind.
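Something like this hypothetical greyhole.conf excerpt is the idea. I haven't verified the exact key names against the current release, so treat the syntax as illustrative:

# Pool the odd-sized drives...
storage_pool_drive = /mnt/hdd0/gh, min_free: 10gb
storage_pool_drive = /mnt/hdd1/gh, min_free: 10gb
storage_pool_drive = /mnt/hdd2/gh, min_free: 10gb
# ...then say how many copies of each share's files to keep
num_copies[Music] = 2
num_copies[Photos] = 4   # paranoid: survives three drive failures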
posted by Tehhund at 6:51 PM on December 20, 2010


All of the RAID solutions I'm aware of will only use the same amount of storage on each member drive. You can use different-sized drives, but you can't use, say, 50 GB from one drive and 100 GB from another; you could, however, use a 50 GB drive and 50 GB from a larger drive in the same RAID volume.

You can use different sized drives in a JBOD configuration but you lose any redundancy, and worse than that, you may lose all the data if any one drive fails.
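To illustrate the "50 GB from a larger drive" trick, a minimal sketch with mdadm; the device names and sizes are hypothetical:

# Carve a 50 GB partition from the larger drive to match the small one
parted /dev/sdc mklabel msdos
parted /dev/sdc mkpart primary 1MiB 50GB
# Mirror the whole small drive against the matching partition
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc1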

Another concern would be finding a motherboard with enough IDE ports to support the number of drives you want to use, assuming your older drives are Parallel ATA. Those ports are rapidly disappearing; I opened a new Dell tower today and it didn't have any IDE ports, just 4 SATA ports for the hard drives and the DVD-ROM.
posted by tresbizzare at 6:51 PM on December 20, 2010


Short answer: buy the largest SATA drives available at a reasonable price and put them in a cheap enclosure that speaks the protocol you want. Don't bother with re-using your old drives.

Short answer you don't want to hear: buy a Drobo.

Short answer you might want to hear: buy a Synology DS410j.

Longer rambling:

I love/hate my Drobo (first gen). The user experience, whether through the DroboShare or directly connected to a Windows or OS X machine, is wonderful. It just works. Upgrading drives is easy. It is, unfortunately, kinda expensive. And oh my gosh golly is it slow (supposedly fixed with the second generation). Am I happy I have it as my network backup device? Yes. Would I recommend it to others? Probably not.

If you thought FreeNAS was too much work, certainly don't bother with BSD/Solaris/Linux and ZFS. I'm pretty sure Solaris is no longer free for personal use.

I love FreeNAS (the 0.7.x series; I haven't tried the FreeBSD 8 based rewrite). It makes setting up NFS, CIFS, AppleTalk, and iSCSI a snap. Setting up complex RAID configurations requires understanding the concepts of what is going on. It has ZFS in beta, but I have not used it. I would not recommend it if you just want something that works with a pretty interface.
posted by fief at 6:59 PM on December 20, 2010


what about a barebones Linux distro running mdadm?
posted by xbonesgt at 7:14 PM on December 20, 2010


+1 to what fief said. There's a limitation in the way the most common RAID file systems work: they can only replicate in chunks the size of the smallest container. So if you've got 80GB, 120GB and 300GB disks lying around, the largest multi-disk redundant array (spread across multiple disks) is going to be 80GB.

On the other hand, if you just want to do JBOD where they're all separate storage devices and all they share is an enclosure, then that's easy. Maybe you can back up the small ones to the larger ones?

But if you want to swap non-matching drives in and out as modules while still retaining data redundancy across physical drives, then you've got to learn some pretty exotic file systems. Or buy a Drobo, where the hardware and a proprietary file system will take care of that for you, but for a price.

Enterprise IT administrators (and some people who edit video) don't like the Drobo because it's a closed and nonstandard platform that scores lower in performance tests than the RAID setups found in pro-grade servers. But for a home / nontechnical / technical-but-can't-be-bothered user who just wants a giant storage bucket they can pop drives in and out of, with built-in protection against any one drive failing? That's Drobo's market.

tl;dr? If you're willing to buy fresh disks (they're cheap), build your own FreeNAS box. If you want to recycle random drives, spring for a Drobo. It's time versus money.
posted by bartleby at 8:16 PM on December 20, 2010


You already mentioned ZFS, which is awesome, so I'm not sure how much I can add beyond that :)
posted by lundman at 9:42 PM on December 20, 2010


The trouble is you'll be putting data onto questionable media. And while the tools to create JBOD-like arrays are crude, the tools to recover from them are even worse (if they exist at all). So you'd take all that time to set it up and move files onto it, only to have a drive crap out and make the data impossible to reconstruct.

Then there's the question of power consumption. Spinning up a bunch of small, slow drives is likely to consume more power than a new 'green' drive. Given that 1TB drives are under $75, it's really a bit foolish to waste the time/money on the old drives. Better to recycle their metal and move on.
posted by wkearney99 at 6:33 AM on December 21, 2010


Response by poster: Good answers all. Apologies for the delay; the holidays kept me from testing. I'll be posting followups today with the results of experimenting with some of the best answers.

And then marking "best" responses, of course.
posted by Pontifex at 12:35 PM on December 27, 2010


Response by poster: While waiting for Fedora to download, I thought I'd respond to a few of the points mentioned here, including those I can't test with a virtual machine + OSS.

@ jmsta, fief
I love FreeNAS; it just doesn't support the type of redundancy / features I'm looking for. It uses conventional software RAID, which, while nice, doesn't support the sort of odd-sized, stick-in-a-random-hard-drive-and-go approach I'm looking for.

I plan to manage the iSCSI target / share with FreeNAS for ease of administration if the tool I settle on doesn't have a nice interface. You really can't beat FreeNAS for usability in that field.
posted by Pontifex at 1:49 PM on December 27, 2010


Response by poster: @tresbizzare

"All of the RAID solutions I'm aware of will only use..."

Right, which is why it won't work for what I have in mind. If you look at the technology behind WHS (Windows Home Server), Drive Extender, it uses file-based replication to provide redundancy across odd-sized drives, giving storage that mimics JBOD with some RAID functionality.

They use VSS (aka Shadow Copy) to provide this functionality, which is unfortunately being removed in the next major release of WHS. Greyhole (above) actually does the same thing (file-based replication), but is open source.

"Another concern would finding a motherboard..."

Good point. I actually address that with controller cards, which come in a variety of styles but are almost uniformly inexpensive and add ports for more hard drives, including IDE. Most also add more SATA ports, paving the way for expansion with more modern hard drives in the future if I fill up the slots on the motherboard I have.

I also played around with hard drive "adapters", which are essentially IDE-to-SATA converters or similar. Though those tend to be more expensive, lower quality, and less commonly purchased than the controller cards above.
posted by Pontifex at 2:00 PM on December 27, 2010


Response by poster: @bartleby && wkearney99

"But if you want to swap non-matching drives in and out as modules..."

"The trouble is you'll be putting data onto questionable media. And while the tools to create jbod-like arrays are crude..."

The file-based replication / redundancy solutions (WHS + Greyhole) get around needing "exotic file systems" or "crude recovery tools" by just copying the files as-is, providing redundancy across enough physical drives in the system to mimic the comparable RAID setup. So you can actually just mount the storage drives as they are and read the files normally; they're not stored in any odd fashion, nor do they require any special tools to read the data. The intelligence is in the program itself, not in how the files are stored.

Here's Greyhole's overview of the process, complete with the command-line programs run to execute each task: HowGreyholeWorks

Just a taste of a possible task completed with the tools Greyhole uses:

# Client:

mv /mnt/Music/file1 /mnt/Music/file2

# Server (Greyhole daemon):

mv /mnt/hdd1/gh/Music/file1 /mnt/hdd1/gh/Music/file2   # rename the primary copy on drive 1
rm /shares/Music/file2                                 # remove the now-dangling symlink in the share
ln -s /mnt/hdd1/gh/Music/file2 /shares/Music/file2     # re-point the share at the renamed copy
mv /mnt/hdd0/gh/Music/file1 /mnt/hdd0/gh/Music/file2   # rename the mirror copy on drive 0
rm /mnt/hdd[0-1]/gh/.gh_graveyard/Music/file1          # drop the old metadata on both drives
# Writes metadata about the above in /mnt/hdd[0-1]/gh/.gh_graveyard/Music/file2

It accomplishes the redundancy and the move with very simple tools, while writing the specialized metadata to specific directories on the volumes used only by Greyhole; nowhere does it modify the actual files in any way.
posted by Pontifex at 2:11 PM on December 27, 2010


Response by poster: "tl:dr? If you're willing to buy fresh disks (they're cheap)..."

I think most of you had this in the back of your mind. I would normally do just that, and recycle the old ones, but even "cheap" is out of my price range at the moment. Times are tight.
posted by Pontifex at 2:12 PM on December 27, 2010


Response by poster: @wkearney99

"Then there's the question of power consumption..."

That is a damn good point; I hadn't considered it.

I'll have to do some math about power consumption differences between the "green" drives you mentioned and doing aggressive spin throttling to minimize power consumption.

Intuitively, I'd think a green drive would pay for itself compared to many smaller drives spinning in parallel, undercutting this whole exercise in terms of cost.
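Back-of-the-envelope, using made-up but plausible figures (the actual numbers would come from the drives' data sheets and my power bill):

# 4 old drives x ~8 W each, spinning 24/7  = 32 W
# 1 new "green" drive idling               = ~4 W
# Difference: 28 W x 8760 h/year           = ~245 kWh/year
# At $0.12/kWh                             = ~$29/year
# A ~$75 1TB green drive pays for itself in under 3 years on power alone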

I'll followup here with my findings about the technologies I'm testing at the moment as it's a good exercise and they look pretty cool. But you've definitely given me something to think about.
posted by Pontifex at 2:16 PM on December 27, 2010


Response by poster: Okay! So now that my test server is in the shop, I have some time to report my findings.

Greyhole

I like it very much.

Good documentation, though it could use more, imho. I plan to add some, given my experimentation.

Nice little community. You have to go digging a bit, but the author is very responsive and helpful. Still rather small as the project is fairly young.

Download the Source. Blog.

The configuration utility (called "web-app" in the download section) isn't cross-platform at the moment, but it's quite usable once you get it working.

There's a high degree of activity, and new versions of the utility and the web-app are released quite quickly.

It's written entirely in PHP at the moment and works very well for being in beta. I plan to put some work into making it better, and also into porting it to a bootable-USB style solution: plug it into a "brick" of hard drives in a case and get an instant Greyhole-style file server.

4/5 stars, given its level of development.

Amahi

Very polished interface, great web site design, very approachable to the novice.

Unfortunately built on Fedora.

For the life of me I cannot get Fedora to compile the Hyper-V modules for testing, so I had a bit of a rough time of it. Following their documentation was easy, but unfortunately the kernel build process with Fedora is not for the faint of heart; read: undocumented errors preventing proper compilation.

I was able to test it briefly on my actual physical test-bed, but I don't like it very much. It's a very good design for its intended audience (novices to Linux who are setting up their first home server and want a work-alike for the older Windows Home Server functionality), but it's not very good for me, where the Amahi server is one of several on my network.

It attempts to be the DHCP server for the network, which makes it awkward alongside my existing DHCP / DNS server. It also insists on being on its own network, which, while fine for the aforementioned novice audience, is impractical for me. Setting the machine's network settings also simply doesn't work: Amahi apparently hard-codes networking information into some configuration file that bypasses Fedora's UI controls, making it impossible to change to the proper network.

They'll be coming out with a new version in a bit, based on Fedora 14. Personally, I think they could benefit from making it more "appliance"-like, in the sense of a virtual machine: strip out the rest of Fedora and make it a single interface with more configuration options.

Here's the feedback page if you want to put in your 2¢ along with mine.

3/5 stars
posted by Pontifex at 8:55 PM on February 7, 2011


This thread is closed to new comments.