I want to aid the Archives, but how do I scan a magazine?
October 9, 2017 2:00 PM

I found that the Archive web site no longer just archives sites, but also archives full magazines with readable text. It is missing some issues that I have. How can I add them?

Recently I was researching a project with Google (who doesn't?). It sent me over to the web archive, which I had always used as the "Wayback Machine" to search old web sites. But now I see they're archiving entire magazines. Specifically, the ones of interest to me were issues of Fangoria.

But in looking I found they were missing a few issues from their catalog. They simply haven't been scanned and uploaded it seems. I have these issues and I want to help upload them, but I can't find any resource on how best to scan them.

I have a scanner I could use...but really I'd rather try to use my phone if possible. The scans on the site currently are very low res, and I found that apps like Google's PhotoScan give me results as good as or better than my scanner's--especially when dealing with magazine print.

I'm more than happy to post-process, OCR, combine photos, etc. on a desktop; I'd just prefer the ease and speed of a camera phone to a slow scanner.

But I'm sure I need to make these into a PDF that keeps the layout and includes OCR text. I don't know of any apps that can do that, for a scanner or a phone.

Has anyone done this, and do you have any tips for me?
posted by arniec to Technology (4 answers total) 5 users marked this as a favorite
Assuming you are talking about the Internet Archive, they have a help page describing how to contribute content:
posted by gyusan at 2:24 PM on October 9, 2017

Sorry, I misread your question and posted too fast. When I have scanned for inclusion in large newspaper projects, we always scanned high-res TIFF files and used Adobe Acrobat to OCR them and turn them into PDFs. There are a lot of moving parts in a large scanned print archive, so it might be best to ask the Internet Archive directly what they'd prefer (or whether they want it at all).
posted by gyusan at 2:29 PM on October 9, 2017

They also have a FAQ page about scanning: https://archive.org/about/faqs.php#Books_and_Texts. They automatically OCR the text when you upload it.
posted by interplanetjanet at 3:58 PM on October 9, 2017

Best answer: We derive the OCR internally. If you'd prefer not to make a PDF, you can upload an appropriately-named zip or cbz-like file full of images in correct order alphabetically (e.g. page001.jpg, page002.jpg -- see here for some technical details) and we will turn it into a PDF/other formats as shown here. Once it's uploaded, contact us at info@archive.org to have it added to the correct collection (you can also hit us up there with any other questions, though response time might be slow as our big annual event is this week). Thank you for contributing!
posted by j.edwards at 4:00 PM on October 9, 2017 [18 favorites]
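
For anyone who wants to script the packaging step described in the answer above, here is a rough Python sketch. It copies a folder of page photos into a zip under zero-padded names (page001.jpg, page002.jpg, ...) so that alphabetical order matches page order. The function name, the folder layout, and the list of image extensions are my own assumptions for illustration, not anything the Internet Archive specifies. The exact naming the zip file itself needs is covered by the technical details linked in the answer and is not handled here.

```python
import re
import zipfile
from pathlib import Path

# Assumed set of image types; adjust to whatever your phone or scanner produces.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def natural_key(path):
    """Sort key that compares digit runs as numbers, so IMG_2 comes before IMG_10."""
    parts = re.split(r"(\d+)", path.name)
    # re.split with a capture group alternates text, digits, text, ...
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def build_page_archive(src_dir, out_zip, prefix="page"):
    """Zip the images in src_dir under zero-padded names and return those names."""
    pages = sorted(
        (p for p in Path(src_dir).iterdir()
         if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES),
        key=natural_key,
    )
    # At least three digits, more if the issue has 1000+ images.
    width = max(3, len(str(len(pages))))
    names = []
    # Scans are already compressed, so store them without recompressing.
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_STORED) as archive:
        for number, page in enumerate(pages, start=1):
            name = f"{prefix}{number:0{width}d}{page.suffix.lower()}"
            archive.write(page, arcname=name)
            names.append(name)
    return names
```

Calling something like `build_page_archive("my_scans", "my_issue.zip")` (both names are placeholders) returns the list of names written, which is worth eyeballing against the magazine before uploading. This assumes the photos were taken in page order and that the original file names reflect that order.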
