Information extraction service - does it exist?
March 17, 2016 4:15 PM

We have a closet full of old paper forms at work. We'd like to convert them into a CSV with 'name' and 'email' columns to import into our email list. Is there a service that would do this for us? Or is there a particular Google search term I should use?

There are a few ways I could see it working:

1. We mail boxes to someone, or they come to us and bring a scanner, they email us the data.
2. We scan the files, then use some tool to OCR and extract the name and email.


Obviously we can just pay someone to work their way through the closet and type stuff in, but it seems like this should be common enough that someone specializes in it. We pay for leads now, so we're definitely willing to pay. It's fine if the paper copies are destroyed during digitization. I've found some services that call themselves digitization, but it seems like they only capture images of the documents and don't extract any information from them.
posted by hermanubis to Computers & Internet (5 answers total) 2 users marked this as a favorite
 
Data entry services. The "CSV" part might be hard -- Excel might be easier -- but the term you want is data entry.
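For what it's worth, if the data comes back as a spreadsheet or a list of rows, turning it into a CSV is usually the easy part. A minimal sketch, assuming the rows are already (name, email) pairs; the function name rows_to_csv is made up for illustration:

```python
import csv
import io


def rows_to_csv(rows):
    """Return CSV text with a name,email header followed by the given rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # quotes any field that contains a comma
    writer.writerow(["name", "email"])
    for name, email in rows:
        writer.writerow([name, email])
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv([("Jane Doe", "jane@example.com"),
                       ("Roe, John", "john@example.org")]))
```

The csv module handles quoting, so a name written as "Roe, John" still lands in a single column when the file is imported.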
posted by flibbertigibbet at 4:21 PM on March 17, 2016 [1 favorite]


OCR still often requires checking for errors. I think you want data entry services -- whether you hire a company or a person.
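A rough sketch of what that checking could look like if you go the OCR route: accept rows that pass a basic sanity check and set the rest aside for a person to look at. The regex and the function name split_for_review are assumptions for illustration, and the pattern is deliberately loose; it is not a full email validator.

```python
import re

# Loose plausibility check, not a complete email grammar.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def split_for_review(rows):
    """Split (name, email) rows into accepted rows and rows needing a human check."""
    accepted, needs_review = [], []
    for name, email in rows:
        name, email = name.strip(), email.strip()
        if name and EMAIL_RE.match(email):
            accepted.append((name, email))
        else:
            # Empty name or implausible address: likely an OCR misread.
            needs_review.append((name, email))
    return accepted, needs_review


if __name__ == "__main__":
    ok, review = split_for_review([
        ("Jane Doe", "jane@example.com"),
        ("John Roe", "john@examp1e,com"),  # comma where a dot should be
    ])
    print(len(ok), "accepted;", len(review), "to check by hand")
```

This only catches misreads that break the address format. A wrong but well-formed address (say, an "l" read as a "1") still gets through, which is why a human pass or a confirmation email is worth budgeting for.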
posted by bluedaisy at 4:27 PM on March 17, 2016


Best answer: Generally, it's called a "service bureau", although that's pretty generic: a Google search for "service bureau" document scanning gets some info. The new buzzword is BPO (business process outsourcing), but it's mostly the same thing. We still call ours a service bureau, but our competitors are all about the BPO. Tomayto, tomahto, for the most part, although BPO is more of a long-term relationship (they open and scan your mail 365 days a year) vs a single project (which is what our service bureau does).

As far as software to do it automatically, this is generally called "data capture"; my employer sells the stuff so I'm quite familiar with how it's done, but it's hard to advise without seeing the documents (and, well, I shouldn't be doing work for free that we normally charge for). We're an EMC house, so Documentum is enterprise-class, Captiva Capture is the workhorse, and QuickScan Pro can handle simple systems.

If the files are uniform -- data always located in the same place, consistent fonts, etc. -- software can do amazing stuff with them. If they're not consistent, the service bureau will have somebody manually retyping everything.

Depending on the amount of data and documents, you're better off outsourcing the work; it is not an easy 'press a button and it works' system to set up, even if it can be fully automated.
posted by AzraelBrown at 4:33 PM on March 17, 2016 [2 favorites]


Best answer: Captricity offers a service like this that might be appropriate.
posted by migurski at 7:13 PM on March 17, 2016


Best answer: EQOD can do this for you, so can any company that scans in standardized test forms from school kids - Pearson, etc.
EQOD contact info:
Phone: 908-591-2560
Email: jacquie@eqod.com
www.eqod.com

As mentioned above, the process is different from OCR: they program a scanner to look for info in the same spot on every form, and whatever is at that location is captured into a field. EQOD has people offshore who can type in the info quickly and cheaply if capture won't work.
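To make the "same spot" idea concrete, here is a minimal sketch of zone-based capture. It assumes some recognition step has already produced words with page coordinates; the Word class, the ZONES rectangles, and capture_fields are all hypothetical and do not describe how EQOD or any particular product works.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


# Hypothetical field zones as (left, top, right, bottom), same on every form.
ZONES = {
    "name": (100, 50, 400, 80),
    "email": (100, 90, 400, 120),
}


def capture_fields(words, zones=ZONES):
    """Collect the words that fall inside each zone, in reading order."""
    fields = {}
    for field, (left, top, right, bottom) in zones.items():
        inside = [w for w in words
                  if left <= w.x < right and top <= w.y < bottom]
        inside.sort(key=lambda w: (w.y, w.x))  # top to bottom, then left to right
        fields[field] = " ".join(w.text for w in inside)
    return fields


if __name__ == "__main__":
    page = [Word("Doe", 180, 55), Word("Jane", 110, 55),
            Word("jane@example.com", 110, 95), Word("Signature", 110, 300)]
    print(capture_fields(page))
```

Anything outside the zones (the "Signature" label here) is ignored, which is why this works well on uniform forms and falls apart when the layout varies from page to page.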
posted by linder6 at 9:38 AM on March 18, 2016

