Best way to scan and OCR a moderate number of pages
October 9, 2014 4:50 PM

I have ~9,000 pages I'd like scanned and OCRed, spread across 20 books. What will this cost me?

There are no digital versions available, unfortunately. Destructive scanning is totally fine. I'm also willing to buy a scanner and do it myself if that's likely to be significantly cheaper or to give better results. I don't need the output proofread or formatted into a nice ebook version.
posted by dilaudid to Technology (6 answers total) 3 users marked this as a favorite
 
Another vote for the ScanSnap series for high-volume scanning. I have not scanned books with it, but have scanned almost everything else over the last 5 years and it is a breeze.
posted by Le Ton beau at 5:30 PM on October 9, 2014


Also, if you end up going the DIY route, I have heard that some print shops will guillotine-cut your bindings off for a small fee. It may be worth calling around. A guillotine cut is probably less likely than a band saw to leave rough edges that could jam up a scanner.
posted by Le Ton beau at 6:17 PM on October 9, 2014


Looks like archive.org has a good scanning service.
posted by Sophont at 9:35 PM on October 9, 2014


DIY


You would be surprised how many books are available digitally.
posted by yoyo_nyc at 4:16 AM on October 10, 2014 [1 favorite]


If you're in the US, I've heard lots of praise for 1 Dollar Scan, but I've yet to test their services myself. The price seems to be $1 USD per 100 pages, plus shipping.
posted by andycyca at 7:55 AM on October 10, 2014
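
For a rough sense of scale, here is a minimal sketch that applies the $1 USD per 100 pages figure quoted above to the ~9,000 pages in the question. The function name `estimate_scan_cost` and the rounding-up of partial sets are assumptions made for illustration, not the service's actual billing rules, and shipping and any OCR add-on are left out.

```python
import math


def estimate_scan_cost(total_pages, price_per_set=1.00, pages_per_set=100):
    """Rough scanning cost in USD, excluding shipping.

    Assumption: pages are billed in sets of `pages_per_set`, and a
    partial set is rounded up to a full one.
    """
    sets = math.ceil(total_pages / pages_per_set)
    return sets * price_per_set


# ~9,000 pages, as stated in the question
cost = estimate_scan_cost(9000)
print(f"Estimated scanning cost: ${cost:.2f} plus shipping")
```

With those assumptions the estimate comes to about $90 before shipping. If billing is done per book rather than on the total, each of the 20 books could round up separately and the figure would be somewhat higher.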


So far, no one has addressed your issue of optical character recognition (OCR). In my limited experience, scanning into PDF files is easy -- converting those scans accurately into text is hard. Is that aspect of it important to you? Maybe someone with more experience can chime in and give some tips on good OCR packages.
posted by alex1965 at 9:36 AM on October 10, 2014


This thread is closed to new comments.