Historical documents online?
September 8, 2011 7:37 AM

My company got lucky and we're working on a web site for an amazing historical society, showcasing a ton of documents from their collection. Do you have any examples of how historical documents are being showcased online? Do you have any suggestions for what they don't do that you wish they would?

The images we have are of very very good quality (recent archival scans) so we'd like to show them off as beautifully as possible.

The more links the better — even if the site isn't beautiful, but they have a fantastic feature you love, that's totally great too!
posted by o2b to Computers & Internet (19 answers total) 24 users marked this as a favorite
I used to work with a lot of historical documents (and for a not-amazing historical society), but besides the big sites, I can't recall any archives that stick in my mind. What I would suggest, though, is really working on having good tagging or keywords attached to the objects so that searching or browsing is easy and not frustrating.

Since you're just making the website, you shouldn't be responsible for that content, but it should be something available for 'administrators', or whoever on their side is tech-savvy enough, to be able to get into and update or attach tags.
posted by cobaltnine at 7:49 AM on September 8, 2011

I really like reading Letters of Note. The idea of highlighting one document a week, with some kind of context added would work well, I think.
posted by raisingsand at 7:58 AM on September 8, 2011

If you can, indexing OCR/text transcriptions for searches would be absolutely wonderful.

I also like search results that link to either a page with a smaller image and a description, or just straight to the full-sized image in the browser.

I also like it when archives are well-indexed by Google, as I might not know your client exists when I want to find information they happen to have preserved.

The presentation outside of search is less important to me, but I prefer a tag system to folders so I can browse the archive multiple ways.
posted by michaelh at 8:04 AM on September 8, 2011
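
To make the tags-versus-folders point above concrete, here is a minimal Python sketch. It assumes each document simply carries a set of tags; the document IDs, tag names, and function names are all made up for illustration and are not taken from any site mentioned in this thread.

```python
from collections import defaultdict

# Hypothetical documents: each ID maps to a set of free-form tags.
documents = {
    "letter-001": {"1863", "civil-war", "correspondence"},
    "map-014": {"1863", "maps"},
    "photo-207": {"portraits", "civil-war"},
}

def build_tag_index(docs):
    """Invert the doc -> tags mapping into tag -> set of doc IDs."""
    index = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index

def browse(index, *tags):
    """Return the sorted IDs of documents carrying every given tag."""
    if not tags:
        return []
    matches = set.intersection(*(index.get(tag, set()) for tag in tags))
    return sorted(matches)

tag_index = build_tag_index(documents)
print(browse(tag_index, "1863"))
print(browse(tag_index, "1863", "civil-war"))
```

Unlike a folder, which gives each document one home, the same document here turns up under every tag it carries, so a visitor can arrive at it by year, by subject, or by both.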

I would love for someone to contextualize the documents for users to be able to learn more.

For instance, link documents together by time, event, or person, and include context to jump off to things like the wiki for a synopsis of the actual history involved and bios of the people involved.

Also, I think having a toggle to see a text transcript of the document would help (better for reading).

I could go on, but turning an artifact from the past into a robust source of knowledge would be rad. You could honestly use a "wiki" sort of option to let people who are interested add metadata related to the documents: links to books, movies, documentaries etc.
posted by straight_razor at 8:29 AM on September 8, 2011

I really like the English Broadside Ballad Archive. What's really great is they allow you to zoom in really close to the image, and I like that they have transcriptions of the documents. Some of them even have recordings! Their advanced search is very user friendly too.
posted by apricot at 8:32 AM on September 8, 2011

I've looked at a lot of libraries doing this sort of thing with some degree of success. Here are some that I've liked:

- Philly History - not only a great site but they've managed to interweave sharing historical documents with an actual revenue model that works. Yay!
- Our Brant wiki - combines historical photos with some other super useful stuff like cemetery records and allows people to add their own local information. Very well-designed.
- American Memory at Library of Congress has great content (and I use it all the time) but the search features are ass and it makes finding stuff much much more difficult than it needs to be and the teeny gallery results are terrible.
- UVM's Landscape Change program showcases how the landscape has changed and they invite input from Vermonters
- Some libraries that do this well: NYPL, UNT (collecting content from many smaller institutions in one place), Yale, University of Washington
- some other decent examples and stories here

These are the specific things that I think a site needs to have:

- images available in high and low res and easily accessible
- very very clear guidelines on copyright and who to contact for rights
- power search including being able to search for a bound phrase, and use boolean in some fashion such as AND, OR and especially NOT
- This should include being able to limit by facets [Open Library is one of the best examples of the facets being easy that I know of, so you can limit by years or subject or just "is available online"]
- I always like being able to see user generated content along with the original document, so comments like Shorpy does or something like how Smithsonian collects metadata [and then adds it to the permanent collection] using Flickr
- keyword searchability if you've got a lot of text. Library of Congress's newspaper project does this decently well
- If it's a really big dataset, have an API so that people can access the material in different ways
- Here are some best practices for digitization projects, as written by libraries
posted by jessamyn at 8:52 AM on September 8, 2011 [8 favorites]
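
The boolean and facet search described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names (`year`, `online`), and the `search` function are hypothetical, and a real site would normally lean on a search engine rather than a loop like this.

```python
# Hypothetical records: transcribed text plus a couple of facet fields.
records = [
    {"id": "doc-1", "text": "Letter from the mayor about the new bridge",
     "year": 1891, "online": True},
    {"id": "doc-2", "text": "Bridge construction ledger",
     "year": 1892, "online": False},
    {"id": "doc-3", "text": "Letter about the harbor fire",
     "year": 1891, "online": True},
]

def search(records, all_of=(), any_of=(), none_of=(), phrase=None, **facets):
    """Filter records by AND / OR / NOT terms, an exact phrase, and facets."""
    results = []
    for rec in records:
        text = rec["text"].lower()
        words = set(text.split())
        if any(term.lower() not in words for term in all_of):
            continue  # AND: every term must appear
        if any_of and not any(term.lower() in words for term in any_of):
            continue  # OR: at least one term must appear
        if any(term.lower() in words for term in none_of):
            continue  # NOT: no excluded term may appear
        if phrase and phrase.lower() not in text:
            continue  # bound phrase: naive substring match on lowercased text
        if any(rec.get(field) != value for field, value in facets.items()):
            continue  # facets: exact match on each requested field
        results.append(rec["id"])
    return results

print(search(records, all_of=["letter"], none_of=["fire"]))
print(search(records, any_of=["bridge", "harbor"], year=1891, online=True))
print(search(records, phrase="new bridge"))
```

The facets are passed as plain keyword arguments, so limiting by year or by "is available online" is just one more filter stacked on top of the text query.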

Maybe a little too narrow but I like what the University of California did with their Mark Twain Project.
posted by Toekneesan at 9:39 AM on September 8, 2011

Not a lot of time to post so I'll just provide a few links to sites that received some attention (positive) from the archives community:

Polar Bear Expedition Digital Collection. You can google Elizabeth Yakel + Polar Bear to find some papers about the site.

Smithsonian Archives of American Art Digitized Collections: important because it utilizes the metadata from the EAD finding aids to produce information at the folder (aggregate) level [which is how archives generally work], rather than the item level.

Amsterdam Archives

Heritage Burnaby (Canadian) and some background info on the project.

If I can think of some others, I might dip back.
posted by kaybdc at 9:51 AM on September 8, 2011

You can see parts of the St. John's Bible online: http://www.saintjohnsbible.org/see/ I think they do a good job displaying it.

I really wish more online catalogs of digital documents were set up for casual users and not archivists (as opposed to, say, the National Archives or the Library of Congress, last time I used either). Some canned, popular searches would be good for new visitors, and as examples for those wishing to try their hand at searching. Exposing the collection to Google Images would be nice, if you get the chance to do so. Providing clear instructions & pricing for ordering reprints -- on every page -- would be good.
posted by wenestvedt at 10:04 AM on September 8, 2011

Lots of great information, thanks so much everyone! This is totally helpful.
posted by o2b at 11:19 AM on September 8, 2011

@kaybdc You mention the "archives community" — is there a place where this community hangs out to share links and/or best practices and such?
posted by o2b at 11:37 AM on September 8, 2011

The main professional organization for archivists in the U.S. is the Society of American Archivists (SAA). There is an associated listserv (scroll down to the end of the page for a link to the actual listserv), as well as Facebook and Twitter accounts. Check out who SAA is following on Twitter. ArchivesNext and Spellboundblog usually post some interesting links. You can also search for tweets with the hashtag #SAA11 to find out what archivists were tweeting about during the recent annual conference (just 2 weeks ago).

I checked your location and as you are in NYC, there is a very active local group, The Archivists Roundtable of Metropolitan NY. I believe that they meet monthly and you could just show up.

Are there any archivists or even archives students working/interning at the historical society for which you're working? I'm sure they'd love to share some ideas with you.
posted by kaybdc at 12:14 PM on September 8, 2011

With regard to best practices for digitization, the NHPRC (National Historical Publications and Records Commission) which provides grants for making historical documents accessible, has a full page of links regarding aspects of digitization projects, including best practices as well as examples of projects that they've funded.
posted by kaybdc at 12:38 PM on September 8, 2011

Look into Islandora as a software solution. Most current digital library options are rather awful/clunky/expensive. UPEI/Islandora seem to be onto something. Islandlives.ca is a decent working example. Specifically look into the titles section of the site. Good luck.
posted by professorpotato at 1:51 PM on September 8, 2011

Thanks again kaybdc! Digitization is out of our hands (fortunately) — but the client has hired the very best, from all accounts. We're talking about hundreds of thousands of (highly valuable) documents.

@professorpotato: They have already decided to use Alfresco to manage the collection; we will be adding a layer of Drupal for the web site. Any thoughts on working with Alfresco?
posted by o2b at 2:32 PM on September 8, 2011

NYPL is one site I've enjoyed using.

Here are some things I think you would need:
- Direct search to find exact matches.
- Keyword searches / tag browse to explore.
- Category browse to explore / find what you're looking for.

I think if you're missing any of these you're toast. If I were organizing historical info, I would want to be able to browse by date ranges, events, themes, etc. You probably also want to bubble up "featured" info.
posted by xammerboy at 2:33 PM on September 8, 2011

Have you seen what NYPL has done with their archives of the 1939-40 World's Fair? It's available as a slick iPad app and as a website. I really enjoyed the iPad app, and it finally demonstrated the utility of that device to me.
posted by ikahime at 6:59 PM on September 8, 2011

What they said, plus the ultra-zoom feature. The magazines on issuu all feature a 'click-to-zoom' and 'full-screen' way of showing off material. Context is key as well - anything written in a foreign language (or cursive) HAS to have a plain-text version.
posted by chrisinseoul at 7:36 PM on September 8, 2011

Protect your copyrighted commentary if you like. That's your prerogative. But don't try to prevent your scans of historical documents or photographs from being reused.
  • Make everything easy to screen print, copy and paste, or just right-click and download.
  • Don't disfigure photos with watermarks.
  • Put the highest resolution scans you have online for downloading (even if your pages display low-res versions for normal browsing).
  • Offer zipped collections of material available for downloading and offline browsing.
  • Encourage redistribution through other channels such as torrents.
It's everyone's history. Don't squat on it.
posted by pracowity at 3:27 AM on September 9, 2011 [1 favorite]
