Recommendations of software/hardware combo to scan magazines
December 26, 2009 7:32 AM
I have 60 to 100 magazines. I want to scan them all completely and then run OCR software on the scans so that they are searchable. I am looking for recommendations for both the scanner and the OCR software. I have a flatbed scanner, so with good software I could probably take the time to do it, but a scanner that would let me feed in several pages at a time would be best. My computer runs WinXP. Ideally, I want to spend no more than $300, with $500 as an absolute max.
A number of multifunction printers (the print/scan/fax/copy/etc. things) will scan in bulk for you if you cut the bindings. (Here's an example that you can find for around $99.) OCR software comes with it.
If you're a DIY person take a look at a reasonably decent home made book scanner here.
And before you buy anything, be 100% sure that you can't just get a digital copy somewhere: if not directly from the publisher, then search Google Books for 'em.
posted by Ookseer at 10:13 AM on December 26, 2009
I second Brian. I haven't seen a better implementation of scanning than the Fujitsu ScanSnap for an application like yours.
posted by london302 at 10:17 AM on December 26, 2009
Depending on how mechanically inclined you are, you may want to build your own scanner. It automatically flips the pages for you but I'm not sure if this scanner would be right for you because you're working with magazines.
It's been done (I'm thinking about doing it when I have the money) and there's a support forum too with other builders.
On the software side, you may want to use post-processing software (which rotates the images, centers them, sharpens the contrast in case the letters appear fuzzy, etc.), depending on the quality that you'd like.
For post-processing on XP, I'd recommend:
scankromsator. Beware!! The author's website has pop-up ads and tries to install shady add-ons in Firefox.
Here's a link to bypass it and just download the software. (FYI: I've only used the software in wine, but scankromsator has been mentioned a bit in the diybookscanner forums; just be careful that the file is virus-free.)
A very sophisticated program. And here is a good introduction (translated from Russian) on how to use this program and all of its options. I found it helpful and more feature-filled than...
Scantailor (also for Mac OS X and Linux)
Doesn't have much documentation on the website (and what it does have is mostly in Russian...)
My personal experience: it doesn't compress TIFFs that contain images well. I had unprocessed 1.4 MB TIFFs, and after processing with Scantailor, the TIFFs that contained images were 15-20 MB, while ones with just text were still only 1 MB or so.
Now, for the OCR'ing:
There are a couple of popular ones:
- ABBYY and Adobe's Acrobat. I haven't used either of those, so I can't give an opinion on them. I use Document Express Editor, which I don't think is sold anymore...
posted by fizzix at 10:37 AM on December 26, 2009 [1 favorite]
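To make the "sharpens the contrast" step above a bit more concrete, here is a minimal sketch in plain Python. It is not how scankromsator or Scantailor work internally; it only illustrates a simple linear contrast stretch followed by a black-and-white threshold. The page is assumed to be a small grid of grayscale values (0 = black, 255 = white), and the function names and default percentiles are made up for illustration. A real scan would be loaded with an imaging library first.

```python
def stretch_contrast(pixels, low_pct=0.05, high_pct=0.95):
    """Linearly stretch grayscale values so gray text moves toward black
    and a grayish page background moves toward white.

    pixels: list of rows, each a list of ints in 0..255 (a hypothetical
    in-memory page, used here only for illustration).
    """
    flat = sorted(v for row in pixels for v in row)
    if not flat:
        return []
    # Pick the darkest and lightest reference levels by percentile,
    # which ignores a few outlier specks at either end.
    lo = flat[int(low_pct * (len(flat) - 1))]
    hi = flat[int(high_pct * (len(flat) - 1))]
    if hi == lo:
        # Flat image: nothing to stretch, return a copy.
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [
        [min(255, max(0, round((v - lo) * scale))) for v in row]
        for row in pixels
    ]


def binarize(pixels, threshold=128):
    """Turn a grayscale page into pure black (0) and white (255)."""
    return [[0 if v < threshold else 255 for v in row] for row in pixels]


if __name__ == "__main__":
    # A washed-out "scan": everything sits between 100 and 150.
    page = [[100, 110, 120], [130, 140, 150]]
    stretched = stretch_contrast(page, 0.0, 1.0)
    print(stretched)
    print(binarize(stretched))
```

Text-only pages that end up pure black and white tend to compress far better than grayscale or color pages, which may be part of the size difference described above.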
2nding the Scansnap.
One possible problem: magazine pages are extremely thin, so you may get bleed-through from the opposite side of the page when scanning. Usually you can fix it by placing a solid-colored sheet of paper behind the page. You might run into that problem using the ScanSnap, though.
posted by wongcorgi at 10:58 AM on December 26, 2009 [1 favorite]
@ookseer - a lot of those inexpensive multifunction printers only have an ADF capacity of under 50 sheets (the Brother you linked only does 15 max), and pretty much none of them scan in duplex.
posted by wongcorgi at 11:01 AM on December 26, 2009
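To put numbers on the ADF point, here is a rough sketch in Python. It assumes a hypothetical 80-page magazine that becomes 40 loose sheets once the binding is cut, and it assumes that flipping the whole stack for the second pass makes the back sides come out last sheet first (this depends on the scanner, so check yours). The function names are made up for illustration.

```python
import math


def adf_loads(sheet_count, adf_capacity):
    """How many times the feeder must be filled for one pass over the stack."""
    if adf_capacity <= 0:
        raise ValueError("adf_capacity must be positive")
    return math.ceil(sheet_count / adf_capacity)


def interleave_simplex_passes(fronts, backs_reversed):
    """Merge two single-sided passes into reading order.

    fronts: scans of the front sides (pages 1, 3, 5, ...) in order.
    backs_reversed: scans of the back sides from a second pass made after
    flipping the whole stack, assumed to arrive last sheet first.
    """
    if len(fronts) != len(backs_reversed):
        raise ValueError("both passes must contain the same number of scans")
    backs = list(reversed(backs_reversed))
    merged = []
    for front, back in zip(fronts, backs):
        merged.extend([front, back])
    return merged


if __name__ == "__main__":
    sheets = 40  # hypothetical 80-page magazine with the binding cut
    per_pass = adf_loads(sheets, 15)  # 15-sheet feeder, as on the Brother
    print(per_pass, "loads per pass,", 2 * per_pass, "loads without duplex")
    print(interleave_simplex_passes(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
```

With a 15-sheet feeder and no duplex, that works out to 6 feeder loads for one such magazine, which adds up quickly over 60 to 100 issues.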
Why the elaborate setups? This is very simple, I've done it many times.
Step 1: Go to Kinko's or any local copy shop with all your magazines. Ask them to cut the bindings off. They can do this for books up to encyclopedia size, so magazines are no problem.
Step 2: Drop the magazine into a ScanSnap scanner, then scan and OCR. Or, if this is a one-time thing, have Kinko's scan them in for you. Ask Kinko's to OCR them, or do it yourself with Acrobat or similar after they've given you the PDFs.
posted by Merlin144 at 12:03 PM on December 26, 2009
Search the usual shady areas of the internet for torrents/usenet postings/etc of these magazines. Someone may have direct digital copies that will look much nicer or someone may have already scanned their collection.
Failing that, I can vouch for a ScanSnap. I've been using one with my Mac for years to store bills and school papers without all the paper.
posted by Brian Puccio at 8:03 PM on December 26, 2009
For software, check out OCRopus. It's the open source software Google has been developing for their massive book scanning project.
posted by gus at 8:41 PM on December 27, 2009
posted by brianogilvie at 7:54 AM on December 26, 2009