Will my Facebook photos still be around in 600 years?
August 31, 2010 9:35 PM

Can you point me to any good writing (research-based, speculative, or heck, even fictional) about the potential historical permanence or impermanence of digital media? I'm interested in both professional (theatrical movies, book publishing, maps...) and personal (digital snapshots, Livejournal, email...) forms of digital information storage.

I currently believe, perhaps wrongly, that digital media are problematically impermanent in the long-term (on the scale of hundreds or thousands of years - maybe a little less so on the scale of a lifetime). But I really don't have much evidence for this view beyond an intuitive sense that information is much more safely stored as physical objects than as 1s and 0s that leave no physical trace on their shells (but I also admit to a rather limited understanding of how current storage technologies work or where they are going).

But surely, if this is actually an issue, there must be people doing something about it or thinking about it. Are there computer engineers working on it? Are there archivists writing theoretical articles about it? Is it possible that someday someone will dig up a cache of early 21st century hard drives and still be able to access the information on them, as people in the 20th century were able to discover modern and not-so-modern information that was still accessible?
posted by bubukaba to Technology (8 answers total) 8 users marked this as a favorite
 
Not writing, but you want to listen to Clay Shirky speaking at The Long Now Foundation: Making Digital Durable: What Time Does to Categories. You might also look at The Internet Archive. You also want to be aware of The Archive Team.
posted by artlung at 9:55 PM on August 31, 2010


The Long Now Foundation itself has a metric crapton* of scholarly research on the subject under the "Links" section about halfway down this page.

*actual scientific term
posted by Rhaomi at 10:09 PM on August 31, 2010


The first PDF at this link, "Ensuring the longevity of digital documents," is a Scientific American article from 1995 that lays out some of the main issues pretty nicely. But Rhaomi is right, there is a metric crapton of literature on this subject, so your concerns are well-founded =)

Also, I just did a Google search for "digital dilemma" (I was looking for a book the Academy of Motion Picture Arts & Sciences' Science and Technology Council issued a few years back about this issue with regard to the field of moving image archiving) -- anyway, that phrase brought up a huge number of other articles and websites with the same title, so you could probably just start there.
posted by estherbester at 10:38 PM on August 31, 2010


I'm not surprised that there's so much out there on the subject. Fascinating stuff so far.

A followup question, given the quantity of material on the subject: what is the consensus about this, if any, in the various fields with a stake in these issues? (I know that in my limited corner of the film world, for example, the consensus is basically in line with the intuitive feeling I articulated above.)
posted by bubukaba at 11:00 PM on August 31, 2010


what you might look into is 'information theory'...one of the most fascinating fields around...information is inherently fragile, digital or not, for reasons mostly arising from the concept of 'entropy'...the most important fact is that ordered states are VASTLY outnumbered by disordered states...imagine a picture puzzle...there is only one arrangement of the pieces where they are in order (when the puzzle is put together), but every time you shake the box you make a new, disordered arrangement.
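To put a rough number on the puzzle analogy, here is a toy Python sketch. It assumes a deliberately simplified model that is not in the comment above: the pieces are distinct, only their order counts (rotation is ignored), and exactly one ordering is the "solved" one. The function names are made up for illustration.

```python
import math


def count_arrangements(pieces: int) -> int:
    """Number of distinct orderings of `pieces` distinct puzzle pieces."""
    return math.factorial(pieces)


def chance_of_order(pieces: int) -> float:
    """Probability that one random shake lands on the single solved ordering,
    assuming every ordering is equally likely (a simplifying assumption)."""
    return 1 / count_arrangements(pieces)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, "pieces:", count_arrangements(n), "arrangements, 1 of them ordered")
```

Even a 10-piece puzzle has 3,628,800 orderings under this model, which is the sense in which ordered states are vastly outnumbered.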
posted by sexyrobot at 11:32 PM on August 31, 2010


One of the better-known cautionary tales is the Digital Domesday Book project, which was thought to be completely unreadable after only 15 years because the specialized hardware necessary to access the data had become obsolete and was either discarded or left to deteriorate. However, with a lot of hard work, a team wrote an emulator and extracted the data from the disks, so it's now viewable on standard PCs.

This case highlights the issue of data stuck in oddball proprietary devices/formats, which is certainly a big problem, but with the rise of the internet, the web, and open standards it's becoming less of an issue. I mean, if I had to choose a handful of formats/standards for which the distant future would be most likely to have extant decoders/renderers, then JPEG and HTML would certainly be at the top of that list. Of course, there's an argument that modern use of HTML is so encumbered with JavaScript and CSS that it's intimately tied to specific browser implementation quirks, but at least for the purpose of simply extracting the structure of the information contained on a page, you can get by without those details. Anyway, that just shifts the emphasis onto the fact that physical storage devices don't last very long and require constant re-copying of the data, even if the format remains open and ubiquitous.
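As a rough illustration of getting at a page's content without any JavaScript or CSS support, here is a minimal Python sketch using the standard library's `html.parser`. The class and function names are hypothetical, and it only collects visible text; it makes no attempt to handle malformed markup or content that a script would have generated.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> list:
    """Return the non-empty text fragments of an HTML document, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style>"
        "<script>var x = 1;</script></head>"
        "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    )
    print(extract_text(page))
```

The point of the sketch is only that the text survives with a very small decoder; layout and behavior do not.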
posted by Rhomboid at 2:22 AM on September 1, 2010


www.digitalpreservation.gov is the Library of Congress website dealing with this topic.
posted by Daily Alice at 6:08 AM on September 1, 2010 [1 favorite]


This is something the library and archives communities have been focusing on for years. Basically, the short version is: the more your digital content is in open, non-proprietary formats, the more likely it is that you'll be able to read it down the road. There are some proprietary formats like Microsoft Word documents or PDFs that are probably "too big to fail," in the sense that they are so ubiquitous that if the companies that created them ever went out of business, a solution would have to be found for reading them. But everybody who's been around computers for more than a few years has had the experience of being stuck with files from some word processing program that can't be converted to anything currently readable.

In addition to file format issues, there is the issue of storing your content with a company that doesn't have any interest in its long-term preservation. Photos that are stored with Facebook are entirely dependent on Facebook staying in business. And someday it will fail or go the way of once-popular websites like Geocities or Friendster. If your data is stored with a company and they don't provide an easy way to get that data out, you're screwed. Making your own backups, sticking to open formats like XML and TIFF, and depending as little as possible on other people to keep your content alive are the best practices.

So there are two answers to your question.
1. Keep your information in open formats or in ubiquitous commercial formats.
2. Don't trust the long-term storage of your items to private companies, which could go out of business or disappear, taking your stuff with them.
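If you do make your own backups, one common way to check that a copy still matches the original is to record a checksum for every file and re-check it later. The following is a minimal Python sketch of that idea, not a full preservation tool; the function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List manifest entries that are now missing or whose contents differ."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A typical use would be to call `build_manifest` when the backup is made, store the result alongside it, and run `changed_files` against each copy every time the data is moved to new media.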

For further reading, here is a selection of articles:

A blog post about digital preservation in the world of law libraries


One of many D-Lib articles on Digital Preservation


An example of libraries developing preservation planning policy

posted by MsMolly at 8:19 AM on September 1, 2010

