In 2022 how do I want to archive data for the long term, offline?
I currently just keep an old-fashioned spinning hard drive (two, if I'm counting) in my safe deposit box, but storage has changed a lot in the last decade. What's the current wisdom for long-term (stable) data storage?

  • Not stored on other people's computers (The Cloud).
  • Roughly a terabyte.
  • Can be stored in a climate controlled box for a decade without power and still be expected to mount when plugged in.
  • Will be accessible by whatever computer and OS I have in a decade.
  • Is a couple hundred USD at most. (I'll likely buy more than one for redundancy.)
It does NOT need to be particularly fast, but it should be faster than searching through a stack of CD-Rs.
posted by Ookseer to Computers & Internet (8 answers total)
There is no such thing as long-term stable data storage. Storage for digital preservation is a relay race, not a marathon.

If the cloud is not appropriate, then your hard-drive strategy will work, but you need your hard drives to be in two different places (to hedge against "tornado or flood takes out the bank building" risk), you shouldn't plan for any given drive to last much longer than three years, and you should ideally be spot-checking those drives a couple times a year.
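The spot-check described above could be scripted along these lines. This is only a minimal sketch, assuming you record a SHA-256 digest for every file when the drive is written and recompute the digests at each check. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for illustration and do not come from any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash one file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing, unreadable, or changed."""
    problems = []
    for rel_path, expected in manifest.items():
        try:
            if sha256_of(root / rel_path) != expected:
                problems.append(rel_path)
        except OSError:
            # Missing files and read errors both count as failures.
            problems.append(rel_path)
    return problems
```

The manifest is a plain dict of strings, so it can be saved next to the data with `json.dump` and reloaded at the next check. An empty list from `verify_manifest` means every recorded file was read back with the same digest.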
posted by humbug at 2:27 PM on August 4

Best answer: I think HDDs are basically the best you can do here. Large companies use tape backups, and I have some friends who do that, but it's a pretty big pain in the ass and kinda expensive. If you're storing stuff you really care about, I think best practice is to store it on multiple drives and replace them every ~5 years.

Here's Archive Team's advice.

M-DISC is another option which seems like it might be good, but I haven't really heard of anyone using it so I don't trust it much (and it's very expensive compared to HDD).
posted by wesleyac at 2:29 PM on August 4

Redundancy: in that budget, your box could contain two or three modestly-priced SATA spinning disks, say 2/3/4TB, with multiple copies of the data on them, ideally in a checksummed/error-correcting filesystem like ZFS, plus a SATA-to-USB adapter plus a power adapter.

Alternatively, a pair of 1 and 2 terabyte SSDs (again using an error-correcting filesystem) will sit in your budget and won't need a power supply with the USB adapter.
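If the drives end up on an ordinary filesystem rather than ZFS, the "multiple copies" idea can still be checked by hand. The sketch below is an illustration only, not something ZFS itself does: it hashes the same file on each copy and reports the copies that disagree with the most common digest. The name `odd_copies_out` is hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, or the marker 'MISSING' if it cannot be read."""
    try:
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()
    except OSError:
        return "MISSING"


def odd_copies_out(copies: list[Path]) -> dict[str, list[Path]]:
    """For each file, list the copies that disagree with the most common digest."""
    names = set()
    for root in copies:
        names.update(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )

    suspects = {}
    for name in sorted(names):
        digests = [file_digest(root / name) for root in copies]
        majority, _count = Counter(digests).most_common(1)[0]
        outliers = [root for root, d in zip(copies, digests) if d != majority]
        if outliers:
            suspects[name] = outliers
    return suspects
```

With three copies, a single bad copy is outvoted and shows up as the outlier. With only two copies, a mismatch still tells you something is wrong, but the vote is a tie and this sketch cannot say which copy is the good one.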
posted by k3ninho at 2:44 PM on August 4

Addition: I can't think of an organisation that would have benchmarks for the longevity of this kind of archive. SATA and USB are the longest-standing domestic interfaces, with 15-25 years of use. SSDs are too new a technology, and long-serving spinning disks are assumed to be kept warm and spinning. Padding around the disks is something you shouldn't overlook.
posted by k3ninho at 2:51 PM on August 4

SSDs aren't currently recommended for long-term storage. The data on them slowly degrades if they're not powered up. (The flash cells slowly lose charge, and normally the internal SSD controller rewrites any blocks that are at risk, but with the drive powered off that never happens.)

I think your hard drive solution is probably about as good as it's going to get. Maybe a second drive with another copy of your data would be prudent, if you really don't want to lose it.
posted by neckro23 at 5:08 PM on August 4

As a rule of thumb, you shouldn't do long-term backups on anything that has moving parts. If the motor or electronics fail in your hard drive, it's just as dead as if the data was corrupted. A DVD may require you to get a new reader, but so far those are pretty good about backwards compatibility.
I use DVDs and write them so nothing else can be added or erased. There were doubts about their longevity at first, but I have CDs which are ~40 years old. I can still read them.
The new M-DISC format is supposed to be good for a thousand years, but it obviously hasn't been tested.
For any kind of media the storage conditions are probably the most important thing. I keep mine in my desk drawer, which isn't ideal, but it's pretty good. I also don't write on the disks - I was told by a backup specialist a long time ago that you don't know what that ink will do to the surface over a decade. If I have to I write on the clear area near the hub.
It's wise to have two copies of stuff you can't afford to lose, and three is better. I was once on the last of three backups before I found a good copy of my data, so I wasn't unemployed and the company didn't go under. I worked for someone who kept one set of backups and they died.
Don't forget offsite backups. A friend had a very good set of backups hidden in the pocket of a suit in his closet. The thieves took all his computer gear and his suit. If a fire could wipe you out, you don't have good backups. If someone who doesn't like you could, same thing.
Check your backups to see if you can read them, as soon as they're made. I worked for a company which did offsite storage in an old nuclear bunker. When our server went down there was no data on the expensive backups. They'd never tried to recover anything. This is a rookie mistake, like making a cake and forgetting to turn on the oven.
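A read-back check right after writing can be as small as the sketch below. It assumes the backup is a plain copy of a source folder and uses the standard library's `filecmp` module. The function name `check_fresh_backup` is made up for illustration.

```python
import filecmp
from pathlib import Path


def check_fresh_backup(source: Path, backup: Path) -> tuple[list[str], list[str]]:
    """Compare every source file with its backup copy, byte for byte.

    Returns (mismatched, unreadable_or_missing) lists of relative paths.
    """
    names = [
        p.relative_to(source).as_posix()
        for p in sorted(source.rglob("*"))
        if p.is_file()
    ]
    # shallow=False forces a content comparison instead of trusting size/mtime.
    _match, mismatch, errors = filecmp.cmpfiles(source, backup, names, shallow=False)
    return mismatch, errors
```

Two empty lists mean every file was read back and matched. Right after a copy the operating system may serve the data from its cache rather than from the backup media, so it is probably safer to eject and reattach the drive before running the check.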
I could go on at length, because I've seen a lot of backup catastrophes. One company lost everything because tapes were too expensive at $12 each. Another overwrote them because there wasn't a problem they wouldn't find in a week. One didn't do offsite backups because the boss didn't want anyone else having the tapes and the IT guys said he was too irresponsible to take them home. Another had two offices and the remote one didn't get a backup system because the only DVD drive they had didn't look good in the server. A small company built a RAID server that was so perfect that it couldn't fail. It was so hot it cooked the tape drive, and then everything died.
I don't think you should look for a permanent, one time solution, or think you're done. I worked for an accountant whose sole backup was a disk two years old. We couldn't read it.
I make a new backup every so often, and I keep the old one. I keep really important data on a USB stick and carry it with me. If it's sensitive I encrypt it. I also burn a DVD every couple of months, but in situations where things changed quickly I've done it as often as daily.
If your original media is getting hard to find or outdated, it's time to copy it onto something new.
posted by AugustusCrunch at 5:40 PM on August 4

Yes, I read that you were not interested in online, but the options are complicated. And I was curious. So... looks like Amazon S3 Glacier Deep Archive is about $12/year for 1TB. It's complicated, and retrieval can be costly if not done with care. But as much as the corp is a hot button, it has better security and reliability than almost any other option.
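The arithmetic behind the roughly $12/year figure can be sketched as follows. The per-GB rate is an assumption (a commonly quoted list price for Deep Archive storage in one US region), so check current pricing. The sketch covers storage only, not requests, retrieval, or data transfer out.

```python
# Assumed list price for S3 Glacier Deep Archive storage, USD per GB-month.
# This figure is an assumption; check current pricing for your region.
ASSUMED_PRICE_PER_GB_MONTH = 0.00099


def yearly_storage_cost(
    gigabytes: float, price_per_gb_month: float = ASSUMED_PRICE_PER_GB_MONTH
) -> float:
    """Storage-only cost for one year; excludes requests, retrieval, and egress."""
    return gigabytes * price_per_gb_month * 12


print(f"1 TB for a year: ${yearly_storage_cost(1000):.2f}")  # prints $11.88
```

Under that assumed rate, 1000 GB comes to $11.88 a year, which matches the "about $12" estimate. Retrieval charges sit on top of this and are where the "costly if not done with care" warning applies.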
posted by sammyo at 7:27 PM on August 4

> Here's Archive Team's advice.

From TFA, emphasis added:
Compact Disks, Digital Video Disks, High Definition DVDs (HD-DVD), and Blu-Ray (BD) are all optical media. This means that information stored on them is read by a laser. It is debated how long the shelf lives of these products are, while commonly accepted that DVDs have a shelf life of 20 years, and CDs have a shelf life of 3, it is unknown about the others (some of these technologies haven't existed long enough to know this).


Compact Disks (CD)

Professionally produced CDs (such as those you would buy from a store) are accepted as having a shelf life of up to 7 years. CD-Rs and CD-RWs, however, have a shelf life of only about 3 years. They store only up to 700 MB and have only a single layer. Not recommended by today's standards.

The bolded parts are utter nonsense. Anecdote time:

- I, and undoubtedly millions of others, own several 30+ year old audio CDs that play fine and can be perfectly ripped (the scarce few pressed CDs that give me issues were all manufactured this century)

- in the recent past I went through the effort of cataloguing my optical media archive, consisting of 500+ CD-R and DVD(+/-)R media, the vast majority of which were burnt 10 to 15 years prior, and only 1 (one) of them had unrecoverable corruption (in a single JPEG file, everything else perfect) while fewer than 5 others required a little effort with ddrescue to dump (less than 20 minutes each). Also of note is that it wasn't all Taiyo Yudens and Verbatims; about 1/4 to 1/3 of those recordables were whatever cheapest media I could purchase at the time.
posted by Bangaioh at 4:55 AM on August 12

