I work for a small non-profit that has had six linear feet of historical records on paper in somebody's basement for the last few decades. Can the hive mind (1) recommend a company that can scan/OCR these papers for us and (2) recommend resources for teaching me how to design and implement a document retention policy? [more inside]
I've inherited a ton of PDFs of scanned documents that are somewhat readable. The sources were actual documents run through a scanner. Now I'm trying to make the OCR'd text readable by screen readers in one way or another. Looking for advice or even keywords to Google. Situational specs ahoy! [more inside]
I'd like to start putting the details of all my grocery receipts into a spreadsheet. What's the simplest and quickest way to do that? [more inside]
I'm looking for a service like Shoeboxed or MakeSpace, where I can get several decades' worth of my parents' financial records scanned/indexed and then sent back to me to go into a regular storage unit. Extra snowflake information about sibling's information security paranoia. [more inside]
I have ~9,000 pages I'd like scanned and OCRed, spread across 20 books. What will this cost me? [more inside]
I have a scanned document with some text that I'd like to edit. [more inside]
I would really like to be paperless. As soon as I receive them, I'd like non-junk mail and documents to be digitized, then shredded. [more inside]
I have a dead-tree book that I want to use OCR software on, but it has a strange font. My attempts so far have not been very successful. [more inside]
What is a good substitute for the Fujitsu Scansnap S1500M? [more inside]
I have a document in English of about 1500 pages that was originally derived from OCR scans of varying quality. I've proofread manually and by spell-check. What is the best strategy to eliminate the remaining errors? [more inside]
So what's the market look like for Japanese OCR software for Mac OS? I'm a bit bewildered and feeling out of my league upon googling around. It would be rad to be pointed in the right direction on this.
Yesterday I picked up a piece of ceramic bric-à-brac promoting RCA's Electronic Data Processing division (active mid-1950s to 1971). It features ten different ways to represent data or algorithms, of which I recognize many, but not all. Can you name the rest? Bonus: Can you decode the ones with actual data? [more inside]
Does anyone have any recommendations for OCR software that focuses on the "recognition" part? [more inside]
I like playing around with text recognition algorithms, but am stymied by a lack of a good corpus to train and test my code against. I'm looking for a large number of images of individual printed letters, labelled with the correct letter. (With the letter in the file name, or each set of letters in a directory, or something equivalent like a metadata file.) Something like this, but more of it.
Does anyone have any recommendations for a free or low cost solution that will let me scan to searchable PDF using OCR? [more inside]
Let's say there is a database that I can only access from a front end tool, and that database cannot provide any extract of any sort. I also have no access to run reports off that database. And let's say that the only way to preserve that data is to print to PDF or take screenshots. BTW, this is all legit and you are not helping me do something malicious. [more inside]
How do I prevent OCR on a document (typically a PDF but I could use another document format if necessary)? I know that when I scan it from a hard copy to a PDF I can disable/stop the OCR process, but Adobe allows it to happen on any PDF I scan in, whether OCR was eliminated at scanning or not, and I have to stop that (I have work product I'd like to distribute electronically, but my boss would like to make sure it's not searchable and it's as hard as I can make it to copy). I can use any software or process within reason.
I have inherited 1,000+ pages of my grandmother's writings. I would like to scan them, OCR them and (after fixing OCR mistakes) share them with the rest of my family online. My question is this: what's the best way to scan so many pages? Also, I should point out that many of the pages are on thin typing paper. Maybe this is carbon paper, or onion skin paper? I'm not sure, but I don't want to damage the originals. [more inside]
Need to get data from hundreds of pages of bank records into a spreadsheet. We have a scanner with a document feeder, but would love some recommendations on software/workflow ideas. [more inside]
I've got a Scansnap printer and I'm ready to go paperless -- but I need the right Mac software to manage my scanned documents. None of the options I've found seem quite right. [more inside]
I have Adobe Acrobat X Pro on Windows 7. Is there any free or inexpensive way to use OCR to create searchable image PDFs from image-only PDFs of texts written in German in Fraktur/Blackletter script? [more inside]
We're looking for an OCR program that will handle batch processing and columns automatically. [more inside]
Looking for hardware/software that will scan a physical form with handwritten fields and generate accurate delimited text from it. [more inside]
What are my best options for digitizing a large collection of business cards? Bonus points for being able to integrate into MS Outlook. [more inside]
My (very) small office just got a new whiz-bang scanner that can scan stacks of paper to image-only PDFs. Ideally, I'd like to use this to do away with paper filing, but this is harder (for me) than it sounds. [more inside]
I'm trying to turn PDFs made from presentations on the Prezi website (prezi.com) into text documents and am looking for a program to OCR them with. Since Prezi's PDFs come out rather odd with some text trailing off the edge that isn't pertinent to the current slide I need something that will allow me to make a selection box around the text I wish to OCR as opposed to auto OCR'ing the entire page. What Windows program should I be looking at?
The full version of Adobe Acrobat has a way to OCR scanned images, so that the image is still viewed in the PDF, but you can search for text in the document. How do you do that without Acrobat? [more inside]
How can I get from paper to e-books for free on a Mac? [more inside]
Currently I teach A Level Psychology using the AQA A spec and want to change exam boards. I am thinking of Edexcel rather than OCR but wondered if anyone has opinions that they could share with me. Pros and cons for either or both, if possible. [more inside]
How do I use OCR to scan a standard document into an excel spreadsheet? [more inside]
How do I use a highlighter? [more inside]
"Pen scanner" recommendations? Is it even called a pen scanner? [more inside]
I need to have about 20 novels scanned, OCRed and professionally proofed for conversion to ebooks. Destructive scanning is acceptable. Have you had recent experiences with a company that provides such a service? What did it cost, and how was their proofing? [more inside]
Suggest to me a fantastic OCR program capable of preserving formatting. [more inside]
Linux script to parse all files in file tree and submit to a program? [more inside]
I'm trying to go paperless, and have scanned and OCR'd huge stacks of paperwork into PDF documents. Can you recommend a tool to split, merge, delete pages etc from PDFs? [more inside]
I'm looking for a handwriting OCR program for OS X. [not evernote] [more inside]
How can I learn how to programmatically recognize Japanese handwriting? [more inside]
Any recommendations for scanning / PDF management software for my Mac that is cheap or, better yet, free? [more inside]
Image Data acquisition.... Is there software out there that can take an image and translate it to data? Kinda like a specialized OCR software. [more inside]
I have 60 to 100 magazines. I want to scan them all completely and then use OCR software so that they would be searchable. I am looking for recommendations on the scanner I should use, as well as the OCR software. I have a flatbed scanner, so with good software, I could probably take the time to do it, but a scanner that would let me feed in several pages at a time would be best. WinXP is the OS of my computer. Ideally, I would want to spend no more than $300, with $500 as an absolute max.
What document imaging company do you recommend that is user-friendly, cheap, and secure? [more inside]
Options besides PDF for digitizing dual-language books? [more inside]
At some point in the future, I'd like to follow the recommendations of some various sites I've seen online and scan most of my paper archives to PDF. It looks like this would be the best solution (I'm a Mac user) – or, at least, that's the device I've seen recommended a few million times. However, I really can't see having the discretionary $417+ to purchase this device, not for a very, very long time. Does anyone offer this device for rental? (I live in Chicago.) Is there a RipDigital equivalent for this kind of thing (a very long time ago, they did the initial move of my music from CDs to MP3s)? Are there cheaper alternatives that are just as good?
My (really not great) handwriting into pretty, pretty computer text? Is Livescribe the answer? [more inside]
ATM check deposits. OCR or just instantaneous offshore data entry? TIA.
Where can I find recommendations for document scanner to handle what I'd consider "medium - large volume" scanning of our office files? [more inside]
How can I convince a document management vendor to stop embracing 100 DPI / JPG as a universal format for scanned documents? [more inside]
My husband has been wondering about getting an OCR scanner pen. There have been a few questions before on this topic, but the technology may have moved on, and he has some specific needs. [more inside]
I have 18 copies of a 47-page document, scanned with handwriting on them. I want to extract the handwritten bits (i.e. compare, page-by-page, and eliminate the "constant" part), despite skewing, offset, and some noise in some copies. I want to use Perl or Python with e.g. ImageMagick or gd or something. Any pointers? I'm not talking about OCR -- just comparison, with one output being the graphical bits that don't match. [more inside]