How best to share a shedload of 100MB images?
December 2, 2010 8:42 AM

Could anyone help an academic history of the book / digitisation project by sharing their experiences of collaborative work on huge image files via the cloud?

As part of my current research project with Princeton University Library to digitise a rare book and its contents before it physically disintegrates, it's necessary for me to be able to share online the very high res images that Princeton have produced for us with collaborators at universities in the States and mainland Europe.

Previously in such situations I've had success using Dropbox. However, the size and quantity of the files involved - each individual file is c.100MB, the entire directory is c.115GB - means this just can't be done. The Arts Computing Service at my university can't handle it in-house either. I've been looking into Picasa and Amazon S3, but have certain reservations and wondered if the community had any alternative suggestions or experience of similar logistical problems.

There's funding available for hosting costs - though this is Britain, so not much. We'd need to be able to set it up in such a way that only team members could access the images, and our partners are very wary of anything that could raise intellectual property issues.

Ta!
posted by bebrogued to Computers & Internet (8 answers total)
 
Is this something that needs to be online? If you won't be sharing with the general public, or even the general population of your schools, could you load the files onto an external hard drive and send it across the pond to them? Then they could host the files internally on their library's internal file server.
posted by MsMolly at 8:53 AM on December 2, 2010


Given your info, I would choose to burn sets of 5 Blu-ray discs and mail them off to your project partners, then set up a wiki for whatever collaborative work the teams are doing (that would enable sharing snapshots and discussing in text). A bonus of going this way is avoiding the frustration of long download times.

Alternatively, fire up an FTP server straight from your on-campus computer, or look into DreamHost file storage (a different service from DreamHost web hosting).
posted by Jurate at 8:55 AM on December 2, 2010


Ask ibiblio.org: they are good about hosting stuff for free (like my paper about my grandpa's WWII service), and I *think* you could put an .htaccess file on there to keep out the general public.
posted by wenestvedt at 9:24 AM on December 2, 2010


At the risk of a bit of a derail: what IP issues are we talking about here? If this is a rare book, and old enough to be physically disintegrating, what is the current copyright status? If it's out of copyright, is Princeton claiming copyright on the scans themselves?

I ask because sharing openly isn't only The Right Thing, it's also the easier thing. Amazon S3 was literally created for exactly this sort of large-file sharing. I was also just introduced to a new file-sharing service called ge.tt (http://ge.tt/) that SAYS no size limit, and begins sharing AS files are uploaded, so you don't even need to get the whole file up before someone can start downloading.

If you need a name/introduction to ibiblio and want to pursue that option, LMK. The director was my advisor.
posted by griffey at 9:49 AM on December 2, 2010


I have worked on a few book digitization projects. One of the projects that I worked on last year used 4Shared to share high-resolution book scan images. I was not the person who set up the file sharing, though, and I don't know whether it would meet your needs—the "premium" virtual drive is listed as 100GB, which falls short of your requirements.

You might want to ask this question in one of the digital humanities fora, though I'm not in close enough contact with the field to know where to send you.
posted by Orinda at 10:13 AM on December 2, 2010


MetaFilter has a book-digitizing guru dude; I forget his nick, but it's a one-syllable, uncapitalized word, IIRC.
posted by StickyCarpet at 1:23 PM on December 2, 2010


Oh yeah, it's fake. Sit tight and hope he shows up here, or get bold and MeMail him?
posted by StickyCarpet at 1:25 PM on December 2, 2010


Response by poster: Many thanks for the replies! We already have the images on hard drive - we need to be able to access, edit, and share the files as far as possible, all from three separate locations.

Just on the IP side, I really didn't explain that at all well. Two quick points. First, we don't want any suggestion that the service provider would gain any interest in the IP of anything we upload. Doesn't seem to be much of a worry really from what I've seen, but I need to be able to quote chapter and verse from the T&C.

Second, a previous project of ours went wrong when our principal partner - one of the most prestigious learned societies in the world - got into an absolutely scandalous deal with Microsoft to provide all their software solutions. We're fully committed to an open source ethos. These two things didn't mix.
posted by bebrogued at 4:30 PM on December 2, 2010

