PDF Handling in macOS Big Sur
July 21, 2021 4:39 PM

What is the best software to handle OCR and PDFs on macOS? FineReader is NOT supported on Big Sur, and though it runs, a major glitch is that mouse drag control does not work. What are my alternatives?

I am doing work with photos of text that are then OCR'd through FineReader, which has WAY better OCR than Adobe, and editing magic happens before I push it out as text. The method was rickety but worked well enough.

I just got an M1 iMac and moved to Big Sur. Well, according to their support center, ABBYY/FineReader is not supported.

What are my options or workflow regarding software to handle OCR?
posted by jadepearl to Computers & Internet (9 answers total)
Tesseract may be an option, if you're able to do some basic command line work. It works on Apple Silicon.
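
For anyone wondering what "basic command line work" looks like, here is a rough sketch in Python that wraps the `tesseract` command. It assumes Tesseract is already installed and on your PATH (for example via Homebrew), and the helper names `tesseract_command` and `ocr_image` are made up for illustration. Tesseract takes an image, an output base name, and an output type such as `txt` or `pdf`.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_format="txt", language="eng"):
    """Build the argument list for one Tesseract run (hypothetical helper).

    Tesseract takes an output *base* name and appends the extension itself,
    so "page.jpg" ends up as "page.txt" (or "page.pdf").
    """
    image = Path(image)
    output_base = image.with_suffix("")  # page.jpg -> page
    return ["tesseract", str(image), str(output_base), "-l", language, output_format]


def ocr_image(image, output_format="txt", language="eng"):
    """Run Tesseract on one image and return the path it should have written."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(tesseract_command(image, output_format, language), check=True)
    return Path(image).with_suffix("." + output_format)
```

Calling `ocr_image("page.jpg")` should leave a `page.txt` next to the photo; passing `output_format="pdf"` asks for a searchable PDF instead.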
posted by They sucked his brains out! at 5:37 PM on July 21

Note that ABBYY FineReader PDF for Mac does support macOS Big Sur; it's the older FineReader Pro for Mac that doesn't support Big Sur. I don't know why the new version is "FineReader PDF" and the old version is "FineReader Pro", but ABBYY says that ABBYY FineReader PDF for Mac is an upgrade that has more features than FineReader Pro for Mac.
posted by RichardP at 6:25 PM on July 21

Response by poster: @RichardP thanks for tracking down the new version. Uh, they seem to have leaped versions and titles, because FineReader Pro is 12.14 while FineReader PDF is version 15. My question still stands because, taking a look at the App Store reviews, which stand at one star, ABBYY/FineReader is in a suboptimal state. Hell of a gamble at $129 with no upgrade pricing path for FineReader Pro Mac people. Uh, I usually get like flowers or something with this kind of offer, you know?
posted by jadepearl at 6:39 PM on July 21 [1 favorite]

Tesseract can be a bit shit on tabular text. It does like photographs, though. It will be nowhere near as good as ABBYY.

If you're okay with the command line, a great front-end for Tesseract is OCRmyPDF. It doesn't make Tesseract's OCR quality much better, but it will multi-thread the processing and run many times faster. It will take JPEGs directly and create an OCR'd PDF of them. It'll use all your computer's resources when it's running, but it's quick.
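
A rough sketch of driving it from Python, assuming `ocrmypdf` is installed and on your PATH. The helper names `ocrmypdf_command` and `ocr_to_pdf` are made up for illustration; the `--language`, `--jobs` and `--image-dpi` options are OCRmyPDF's own. `--jobs` is the one to reach for if you'd rather it didn't take over every core, and `--image-dpi` is for photos that don't carry resolution information.

```python
import subprocess


def ocrmypdf_command(source, output_pdf, language="eng", jobs=None, image_dpi=None):
    """Build the argument list for one OCRmyPDF run (hypothetical helper)."""
    cmd = ["ocrmypdf", "--language", language]
    if jobs is not None:
        # Cap the number of worker processes instead of using every core.
        cmd += ["--jobs", str(jobs)]
    if image_dpi is not None:
        # Tell OCRmyPDF the resolution when the image file does not record it.
        cmd += ["--image-dpi", str(image_dpi)]
    cmd += [str(source), str(output_pdf)]
    return cmd


def ocr_to_pdf(source, output_pdf, **options):
    """Run OCRmyPDF and raise if it exits with an error."""
    subprocess.run(ocrmypdf_command(source, output_pdf, **options), check=True)
    return output_pdf
```

For example, `ocr_to_pdf("page.jpg", "page.pdf", jobs=2, image_dpi=300)` would turn one photo into a searchable PDF using at most two worker processes. The 300 DPI figure is only a guess for a typical photo of a page.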
posted by scruss at 7:49 PM on July 21 [1 favorite]

PDFPen is a MacOS PDF manager that will do OCR pretty well.
posted by yclipse at 11:51 AM on July 22

The next version of macOS - Monterey, which will be out in a couple of months - has built in OCR. They call it Live Text. From the press release:

Live Text uses on-device machine learning to detect text in photos, including phone numbers, websites, addresses, and tracking numbers, so users can copy and paste, make a phone call, open a website, and easily find more information. Visual Lookup also uses machine learning to help users discover and learn about animals, art, landmarks, plants, and more in photos. These features work across macOS, including in apps like Photos, Messages, and Safari.

If you can get by until Monterey is released, it might be worth at least checking that out.
posted by mewsic at 12:05 AM on July 23

Could you run a virtual machine to expand your options? There are a lot of Linux-based OCR solutions out there (tesseract, cuneiform, gocr, probably others I'm unaware of). Another option is a cloud service like the Google Cloud Vision API. You'd need a programmer's help to leverage that, though in my experience it's pretty easy to use (so you wouldn't need a particularly skilled programmer).
posted by axiom at 9:52 PM on July 23

tesseract, cuneiform, gocr

Cuneiform hasn't seen an update in a decade, and gocr doesn't understand more than one column. Google cloud uses tesseract.
posted by scruss at 12:08 AM on July 24

TextSniper got mentioned on Daring Fireball. Might be worth taking a look at.
posted by They sucked his brains out! at 2:30 PM on July 27
