Digital Photography Basics and OCR
March 24, 2007 10:24 AM
Figuring out digital photography to support an OCR project
My work just bought a new scanner that uses two SLR cameras to do the imaging. I've been charged with setting up the system for using this scanner to digitize and then OCR a book collection. I am a total novice about digital photography, so I am looking for a good site/book/paragraph to explain the basic details that I need to know (like lighting, lenses, camera speed, etc.) for setting up the cameras. Secondly, I am also trying to figure out how to set up the dimensions and formats for the images in order to produce an optimal PDF that can go through the OCR process. At present, the cameras are outputting the images as JPEGs, each about 3MB/page and roughly 3000 x 2000 pixels. I know that these dimensions are too large for a letter-sized PDF, so I'll need to resize them in a batch process in Photoshop to the correct dimensions without losing clarity. The minimum dpi for OCR is about 350, so I'm wondering how to ensure I hit that threshold after the resizing.
posted by gov_moonbeam to computers & internet (6 answers total)
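The dpi question above comes down to simple arithmetic: effective dpi is pixels divided by the printed size in inches. Here is a minimal sketch of that calculation, assuming the page fills the whole camera frame and the target is a US letter page (8.5 x 11 in). Both are assumptions; a real book page may be smaller, which would raise the effective dpi. The function names are made up for illustration.

```python
import math

def effective_dpi(width_px, height_px, page_w_in, page_h_in):
    """DPI of an image placed on a page of the given size, long side matched to long side."""
    long_px, short_px = max(width_px, height_px), min(width_px, height_px)
    long_in, short_in = max(page_w_in, page_h_in), min(page_w_in, page_h_in)
    # The tighter of the two axes limits the usable resolution.
    return min(long_px / long_in, short_px / short_in)

def pixels_needed(page_w_in, page_h_in, dpi):
    """Minimum pixel dimensions (width, height) to reach the target dpi on this page size."""
    return (math.ceil(page_w_in * dpi), math.ceil(page_h_in * dpi))

if __name__ == "__main__":
    # Figures from the question: roughly 3000 x 2000 px, 350 dpi target.
    print(f"effective dpi on letter: {effective_dpi(3000, 2000, 8.5, 11):.0f}")
    print("pixels needed for 350 dpi on letter:", pixels_needed(8.5, 11, 350))
```

Under those assumptions, a 3000 x 2000 image spread over a full letter page works out to less than 350 dpi, not more. Resizing in Photoshop changes the pixel count but cannot add detail the camera did not capture, so the threshold has to be met at capture time (tighter framing or a higher-resolution sensor).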
I've scanned a lot of books, using consumer-grade equipment and programs. My workflow was always: scan to TIFF, batch import the TIFFs into OmniPage, OCR, convert to PDF. OmniPage can be set up to do it all automatically (basically one click once you get it set up right).
Just out of curiosity, could you post a link to the scanner? I've never seen one that uses SLR cameras for scanning.
posted by i_am_a_Jedi at 11:43 AM on March 24, 2007