Looking for large hard drive
April 27, 2008 10:43 AM

I am looking for a large-capacity external hard drive, 1TB or more. I hear that drives of that size are not yet reliable. I will be digitizing my music collection and want to store it on a reliable drive. Any recommendations or experiences? Also how can I take care of this drive so it will last forever or almost forever?
posted by citybuddha to Computers & Internet (14 answers total) 21 users marked this as a favorite
Setting aside the question of why you need 1TB+ for music:

You don't want a single drive.

You'll want something like a Drobo where you can slap in multiple large drives that will appear to you as a single superlarge drive. Or you want a RAID setup, but those require effort to maintain, and Drobo-style solutions are designed to be really easy to use.

Also: It will not last forever. It will not last almost forever. There's really nothing you can do about that. Hard drives consist of chunks of metal spinning around at thousands of revolutions every minute; sooner or later they all fail. Companies and people who rely on avoiding disaster don't buy special extra-reliable drives; they use things like RAID mirroring and regular backups to be able to recover when the totally inevitable failures happen to their standard-issue drives. Things may change a bit as flash memory gets cheaper, as it lacks moving parts, but that's really not relevant at the kind of size level we're talking about right now.

A drive will last, usually, 3-5 years. Some will die in a year; some will last 8. You should always assume, at any given time, that failure's about to happen, and build your backup/mirroring with that assumption in mind.

Your drive cannot, and will not, last forever. But if you back up regularly, your data will survive.
posted by Tomorrowful at 10:56 AM on April 27, 2008 [2 favorites]

How much music do you have? I just did this with my >40k song collection (high-bitrate MP3, mostly 320kbps) and filled about half of a nice LaCie 500 ($200 at Guitar Center). The 500GB drives are everywhere these days.

As for "forever", it's a pipe dream baby. Your goal should be archiving your current media for about 3-5 years, until the next "stable" digital archiving platform comes along.

I've got a bundle of archived material that's come up from 5-1/4" floppies through 1.44MB disks, Zip disks, CD-R, and DVD, and is now just stacks of hard drives, waiting for the permanent diamond-based holographic crystal nano-memory cubes to drop.
posted by Aquaman at 10:58 AM on April 27, 2008 [1 favorite]

> Also how can I take care of this drive so it will last forever or almost forever?

You can't. Plan on this drive failing. Buy two, keep them in different locations, and keep them synchronized (e.g. with unison). When one fails, replace it.

Also, a terabyte is a hell of a lot of space for music. If you use lossless compression (e.g., FLAC), that's enough space for roughly 2,500 CDs. With lossy compression like MP3, it's more like 10,000.
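For anyone who wants to check those ballpark figures against their own collection, here is a rough sketch of the arithmetic. The album length, FLAC compression ratio, and MP3 bitrate are all assumptions picked for illustration, not measurements; swap in your own numbers.

```python
# Rough capacity estimate: how many albums fit on a drive of a given size?
# Every constant except the CD bitrate is an illustrative assumption.

CD_AUDIO_KBPS = 1411.2           # 44.1 kHz x 16 bit x 2 channels (CD audio)
ALBUM_MINUTES = 60               # assumed average album length
FLAC_RATIO = 0.6                 # assumed FLAC size relative to uncompressed audio
MP3_KBPS = 192                   # assumed MP3 bitrate
DRIVE_BYTES = 1_000_000_000_000  # 1 TB as drive vendors count it


def album_bytes(kbps: float, minutes: float = ALBUM_MINUTES) -> float:
    """Size in bytes of `minutes` of audio at `kbps` kilobits per second."""
    return kbps * 1000 / 8 * minutes * 60


def albums_per_drive(bytes_per_album: float, drive_bytes: int = DRIVE_BYTES) -> int:
    """Whole albums that fit, ignoring filesystem overhead."""
    return int(drive_bytes // bytes_per_album)


flac_albums = albums_per_drive(album_bytes(CD_AUDIO_KBPS) * FLAC_RATIO)
mp3_albums = albums_per_drive(album_bytes(MP3_KBPS))

print(f"FLAC: roughly {flac_albums} albums per TB")
print(f"MP3 at {MP3_KBPS} kbps: roughly {mp3_albums} albums per TB")
```

Under these assumptions both results land in the same ballpark as the figures above; a higher MP3 bitrate or longer albums will pull the counts down.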
posted by qxntpqbbbqxl at 10:58 AM on April 27, 2008 [1 favorite]

> Also how can I take care of this drive so it will last forever or almost forever?

You can't. Hard drives, even if operating perfectly, will all eventually fail. First, they have moving parts. Moving parts wear out, and these parts can't be easily replaced without a clean room.

Second, even a perfectly operating hard drive will occasionally hand back bad data. It's in the specification documents, usually as an unrecoverable read error rate (about one error per 10^14 bits read, but don't quote me on that).

The way you ensure your digital data lasts forever is to develop a redundant backup plan and stick to it religiously.
posted by alana at 10:59 AM on April 27, 2008

> Also how can I take care of this drive so it will last forever or almost forever?

You really can't -- a hard drive is not a reliable storage mechanism for anything you want to keep permanently. If you don't believe us, you will eventually learn this the hard way.
posted by advil at 11:01 AM on April 27, 2008

One thing you need to make sure of is that you're actually getting one 1 TB drive, not two 500 GB drives combined in a single enclosure. That effectively doubles the failure rate, since your data is gone if either of them dies in the case of RAID0, or hard(er) to recover in the case of JBOD.

No drive will last forever. You always need backups (to another 1TB drive, tapes, ...). Most drives die because of frequent power on/offs. So while good power management will save you energy (and noise), it will also shorten the lifespan of the disk.

As for the connection, I prefer FireWire over USB. It's faster (certainly FW800), uses less CPU and has a smaller performance overhead. The downside is that it's not as ubiquitous as USB. I haven't used eSATA-connected drives yet, so I can't comment on that. Many of the higher quality/priced external drives have multiple interfaces anyway.
posted by lodev at 11:03 AM on April 27, 2008

For maximum longevity, buy a pair of USB/FireWire drives and keep the contents synchronized. Consider keeping the backup off-site to improve the odds of retaining your data. Every HD will eventually fail. I had a Seagate 300GB HD which housed my iTunes library take a dump right before Christmas, but as the data was mirrored on a separate drive, no worries.

I've got too many optical discs in my house and with the low price of huge discs, I'm moving to a solely hard drive based storage solution instead. I have about half a terabyte of HD movies that I synch up when I add new content.
posted by porn in the woods at 11:04 AM on April 27, 2008

As everyone else said, hard drives will fail. Expect it, and keep multiple copies (different discs, different locations) of anything you want to last.
Something fancy like Vice Versa is great for this. But you can do it for free using your OS's scheduler and a few batch files.
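As a rough illustration of the free, scripted approach, here is a minimal one-way mirror in Python rather than batch files. The function names and the two-second timestamp tolerance are my own assumptions; a dedicated sync tool handles many edge cases (renames, deletions, conflicts) that this sketch ignores.

```python
import os
import shutil
from typing import List

# Tolerance for filesystems with coarse timestamps (FAT stores 2-second steps).
MTIME_SLACK_SECONDS = 2


def needs_copy(src: str, dst: str) -> bool:
    """True if dst is missing, differs in size, or is clearly older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime + MTIME_SLACK_SECONDS


def mirror(src_root: str, dst_root: str) -> List[str]:
    """Copy new or changed files from src_root into dst_root.

    One-way only: nothing is ever deleted from dst_root, so an accidental
    deletion on the source does not propagate to the backup.
    Returns the relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also copies the modification time
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Point your OS's scheduler at a script that calls `mirror()` with your music folder and the backup drive's mount point, and check the returned list now and then to confirm it is actually doing something.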
posted by jockc at 11:08 AM on April 27, 2008

I have one of these and have been happy with it. Maybe out of your price range but there are similar solutions that may be cheaper.
posted by Justin Case at 11:17 AM on April 27, 2008

The first thing that kills disks prematurely is heat. The next thing that kills them is vibration within loose mounts. Finally, most external disk boxes have cheap fans and $2 power supplies that will tend to die way before the disks they contain.

Everything everyone tells you about multiple redundancy is true. Think about how many hours it will take you to rip, tag, and index your collection. Think of a second drive as a fixed insurance policy.

Another thing to consider when you are "digitizing" your music collection is to avoid using a lossy format (MP3, AAC, WMA) as your primary digitisation. Instead, choose a non-lossy format such as FLAC (there are many to choose from, including an Apple-blessed format), and budget your space accordingly (compressed, lossless formats take up ~2x the space of 320 Kbit/s MP3). That way you have a repository of "pristine" source files from which you can easily output lossy formats in a bitrate or quality level suitable for your destination players. You can do this in a batch process that takes minimal user time or, with some players such as MC, the files get transcoded during playback depending on the capacity of the pipe and the ability of the player. Sometime in the future when you upgrade your players or your speakers, if you had all lossy formats then you could start hearing compression artifacts, and re-ripping to a higher quality level would suck (I say this as someone who had to re-rip from CD all the MP3s I made during the mid-90s at high bitrates with what was then state of the art, and found it an extraordinary pain in the arse). However, with lossless files, batch-creating a higher quality set is trivial.
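To give a feel for what such a batch process might look like, here is a hedged sketch that walks a tree of FLAC files and builds one transcode command per file. It assumes ffmpeg with the libmp3lame encoder is installed, which is my choice of tool and not something named above; the function names and the 192k default are also just illustrative. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess
from typing import List


def transcode_commands(src_root: str, dst_root: str, bitrate: str = "192k") -> List[List[str]]:
    """Build one ffmpeg command per .flac file under src_root.

    The output tree mirrors the source tree, with .mp3 extensions.
    """
    commands = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in sorted(files):
            if not name.lower().endswith(".flac"):
                continue
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, os.path.splitext(rel)[0] + ".mp3")
            commands.append(
                ["ffmpeg", "-i", src, "-codec:a", "libmp3lame", "-b:a", bitrate, dst]
            )
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for cmd in commands:
        dst = cmd[-1]
        if dry_run:
            print(" ".join(cmd))
        elif not os.path.exists(dst):  # skip files already transcoded
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            subprocess.run(cmd, check=True)
```

Because the lossless files stay untouched, re-running this later with a different bitrate or encoder only costs machine time, which is the whole point of keeping a pristine source set.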

But get two disks! Either running in parallel as RAID, or with one that is used for regular, offline backups (and preferably located distant from the main disk). Purists will tell you that RAID is not backup, and that's true, but it's better than nothing. But think about a non-local disk backup!
posted by meehawl at 11:25 AM on April 27, 2008

> Also how can I take care of this drive so it will last forever or almost forever?

You can't.

If you use the drive, eventually its moving parts will wear out and it will fail. If you don't use the drive, eventually the lubricants will dry up and it will fail. It's possible that the latter will take a long time* for a well-constructed drive, but it would be difficult to predict exactly how long. (The drives haven't been around that long, so all you have to go on are MTBF figures that use accelerated aging, and they're not particularly reliable and really are only helpful if you have lots of drives and want to predict the approximate replacement rate you'll need. They're great if you're Amazon or the DoD, less helpful if you're Joe User with a couple of drives.)

It would be a grave mistake, IMHO, for you to spend a lot of time digitizing your music collection and only store it on one drive. At the very least I would buy two drives, preferably from different manufacturers, and store it in two places. Get quality drives -- and if they're externals, make sure they're in quality enclosures with good cooling and circuitry that allows idle spin-downs -- cross your fingers, hope for the best, and have a backup strategy.

To be really blunt, I'd say that if you can't afford to buy two 1TB drives, you should either hold off on the project until you can (if it really requires 1TB), or instead buy two 500GB drives and do half the project, storing it in two places, and then once you fill that, buy two more. 1TB is a lot of data, and that represents a lot of invested time and effort; I would never fill a single drive with that much stuff that's not backed up elsewhere. It's an awfully big 'basket' to have all your eggs in, which means when it inevitably fails, you'll just lose that much more stuff.

What you want to do depends on your level of comfort and how much money you have to spend. Personally with storage being as inexpensive as it is, I would never have anything that represents more than a trivial amount of effort not backed up at least once. (And anything that's more than a day's worth of work in aggregate I try to keep in two places.) But everyone is going to have a different level of comfort depending mostly on how much you value your time.

* I have hard drives from the early 1990s that still work, or at least spin up, but I've thrown out many over the years that have failed prematurely. I've also seen a common situation where a drive that's sat around for a long time will work for a short time after it's plugged back in, but will then quickly fail.

I'm not sure if anyone has ever studied how long an unplugged, offline drive will remain viable, but even if you do find such statistics, they'll really only be useful in aggregate: they don't give you much insight into "your" drive in particular. (In the same way that knowing the half-life of an element doesn't tell you when a particular atom is going to split, knowing an MTBF doesn't tell you when a particular unit is going to give up the ghost.)
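A small sketch of why MTBF is a fleet statistic: converting it to an annual failure rate gives a usable replacement budget for thousands of drives, but only a small probability for any one of them. This assumes a constant failure rate (an exponential lifetime model), which is a simplification, and the MTBF value is a made-up datasheet-style figure.

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days


def annual_failure_rate(mtbf_hours: float) -> float:
    """Probability that a drive fails within one year of continuous use,
    assuming a constant failure rate (exponential lifetime model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures(fleet_size: int, mtbf_hours: float) -> float:
    """Expected number of failures per year across a fleet of identical drives."""
    return fleet_size * annual_failure_rate(mtbf_hours)


MTBF = 1_000_000  # hypothetical datasheet figure, in hours

afr = annual_failure_rate(MTBF)
print(f"annual failure rate per drive: {afr:.2%}")
print(f"fleet of 10,000 drives: about {expected_failures(10_000, MTBF):.0f} failures a year")
print(f"two drives at home: {expected_failures(2, MTBF):.3f} expected failures a year")
```

The fleet number is something you can plan spares around. The two-drive number tells you almost nothing about whether your particular drive dies next week.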

posted by Kadin2048 at 11:27 AM on April 27, 2008

Mark Pilgrim's post on long-term backup -- or rather, the discussion in comments -- is worthwhile reading here.

The variables here are price, reliability and size. You can't get large, reliable storage on the cheap, either in terms of hardware or labour. It becomes a case of compromises, and that means having redundancy and a backup strategy. Plan for failure, budget for failure, and don't throw away your CDs.
posted by holgate at 11:53 AM on April 27, 2008

Don't just use RAID and assume that's that; there's plenty of failure modes that'll kill a RAID-1 just as easily as a single drive. All RAID-1 will get you in such cases is your data loss mirrored across two drives.

For my important data, I RAID-1 for availability; if one drive dies, oh well, it's not got in the way of my ability to do anything. I then make incremental backups of that to a third drive, complete with par2 recovery records, so if both die, or if I manage to do something stupid and delete something, or my filesystem gets eaten, I still have a means of recovery.
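If recovery records sound like overkill, a checksum manifest is a lighter cousin worth sketching. To be clear, this is not par2: it can only detect that a file has gone missing or been corrupted, not repair it, so you would restore the flagged files from another copy. The function names are hypothetical.

```python
import hashlib
import os
from typing import Dict, List


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def verify(root: str, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose contents changed."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != expected:
            bad.append(rel)
    return bad
```

Build the manifest when you make the backup, store a copy of it somewhere other than the backup drive, and run `verify()` whenever you rotate disks.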

On top of that, I actually have two third drives, which I rotate once a month, keeping one offline. Hot-swap bays make this fairly painless, and if the worst happens and it all explodes, the offline disk has at least a chance of surviving (assuming the rest of the house does). Storing it off-site would probably be a good plan, but I just keep it in another cupboard.

As for drive reliability; stop worrying about it. Spending an extra £60 on an enterprisey "nearline-storage" drive might net you a more reliable drive in theory, but it's still going to die horribly at some point, so you need to be able to cope with it in either case, whether it's a brand new 1T model or a tried and tested 36G SCSI drive, or even a fancy solid state thing.
posted by Freaky at 1:11 PM on April 27, 2008 [1 favorite]

Drive reliability works like this:

Let's call the probability that a given drive will die on any given day D. D will, in general, be a very small number, since drives are pretty reliable; it tends to be larger in the first few weeks after the drive is first deployed, very very low for most of the drive's life, and larger again as the drive wears out after a few years (see "bathtub curve"). For the sake of argument, I'm going to pull a gross over-estimate of D out of my arse and assume it's a constant 0.001 (or 0.1%, if you prefer). It follows, then, that the chance of a given drive not failing on any given day is (1-D) or 99.9%. A drive with this level of reliability would have about a 70% chance of lasting a whole year: (1-D)^365.
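The survival figure is easy to check numerically. A minimal sketch, using the same made-up constant D and treating each day as independent:

```python
D = 0.001  # deliberately pessimistic daily failure probability from the text


def survival(days: int, daily_failure: float = D) -> float:
    """Chance a drive gets through `days` days, each an independent trial."""
    return (1.0 - daily_failure) ** days


one_year = survival(365)
three_years = survival(3 * 365)

print(f"chance of lasting one year: {one_year:.1%}")
print(f"chance of lasting three years: {three_years:.1%}")
```

One year comes out a little under 70%, and with this pessimistic D the odds of making it through three years drop to roughly one in three.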

Now, if you're running two drives with no redundancy (RAID-0 or JBOD configuration) to gain increased disk capacity, the chance of both of them surviving any given day is (1-D)x(1-D). That makes the chance that one of them will fail, which is all that's required for you to have a nasty data loss on your hands, equal to 1-(1-D)x(1-D); about 0.2%, or twice the failure likelihood for a single drive. A terabyte drive would therefore need to be twice as unreliable as a 500GB drive to justify choosing the pair of 500's on reliability grounds.

If you're running two drives, the chance of them both failing on the same day (assuming that the cause is internal to the drive, and not something like the whole disk array falling off the bench or the power supply failing in a data-destroying way) is DxD. If they're configured as a RAID-1 set (total redundancy) then both drives must fail to make data non-recoverable. For our example choice of D = 0.001, the likelihood of total array failure due to internal drive failure has now been reduced to 0.000001. You get a really strong drive reliability multiplier effect from RAID-1.

RAID-5 trades off some of that for more capacity. If you had, say, four 320GB drives configured as a 960GB RAID-5 array, you would only lose data if two or more failed on the same day (assuming it takes you less than a day to replace a failed drive). The chance of two out of four drives suffering internal failure is roughly the square of the chance of one out of four failing: (1-(1-D)x(1-D)x(1-D)x(1-D)) squared, or about 0.004 x 0.004, or 0.000016, or 0.0016%; about a 60 times reliability improvement on D.
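Here is a sketch that puts the RAID-0, RAID-1 and RAID-5 cases side by side, again for D = 0.001 and assuming drives fail independently with less than a day to swap in a replacement. It computes the squared shortcut for RAID-5 and, for comparison, the exact binomial chance of two or more failures out of four; the helper name is my own.

```python
from math import comb

D = 0.001  # assumed daily failure probability per drive


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent drives fail on one day."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


raid0_two = p_at_least(1, 2, D)              # either drive failing loses data
raid1_two = p_at_least(2, 2, D)              # both must fail: D x D
raid5_four_rough = p_at_least(1, 4, D) ** 2  # the squared shortcut
raid5_four_exact = p_at_least(2, 4, D)       # two or more of four failing

for label, p in [
    ("single drive", D),
    ("RAID-0, 2 drives", raid0_two),
    ("RAID-1, 2 drives", raid1_two),
    ("RAID-5, 4 drives (rough)", raid5_four_rough),
    ("RAID-5, 4 drives (exact)", raid5_four_exact),
]:
    print(f"{label:26s} daily loss chance {p:.7f}  ({D / p:8.1f}x vs single)")
```

The squared shortcut works out to an improvement of around 60 times over a single drive. The exact binomial figure is somewhat more favourable, since the shortcut overcounts, but either way RAID-5 sits between RAID-0 and RAID-1, and none of it accounts for shared failures like a dying power supply.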

If these figures are all you're going on, though, ignoring the possibility of power supply failure or shared mechanical damage will come back to bite you. This is why you will so often hear experienced professionals chanting the "RAID is not backup! RAID is not backup!" mantra. If you're around disk drives for long enough, you will see tears and wailing and rending of garments caused by people who fail to understand this.

The way to reliability is avoiding single points of failure. Make backups. Make at least two backups. Using hard disk drives for this is fine (and cheaper than CD-ROM, per gigabyte) but give them their own power supplies (e.g. use external enclosures), and take them offline and preferably offsite when you're not actively backing up to them.

Digitizing a music collection while spending as little money as possible? Just buy the big fat hard disk, unless the cost per gigabyte for multiple smaller disks is less and you feel like playing RAID-5 games, because the very media you're digitizing can be your primary backup. Digitizing a music collection is a fairly similar process to restoring from backups anyway; it's just slower. Automate the process to the greatest extent possible and keep the original media, even if you have to box them up and pay for offsite self-storage. That way, if your drives fail, you can do it over without too much pain. All you'd need to pay to back up, then, is any metadata you can't generate automatically. This will generally be much smaller than the music itself, and you can hand the backup problem off to Jungle Disk or Mozy for pennies.

But I expect that once you've actually gone through the process of doing a terabyte's worth of digitizing work, the extra money you'd need to spend on two more backup drives is going to start looking pretty small.
posted by flabdablet at 6:29 PM on April 27, 2008
