Selecting text from PDF in Preview
June 30, 2006 12:01 PM

Some PDFs in Preview (OS X) have quite fuzzy text even at high zoom, and I can't select the text (which is the main problem). Is there a remedy for this?
posted by raheel to Computers & Internet (14 answers total)
 
Are you sure it's actual text, and not a scanned image of text?
posted by yeoz at 12:04 PM on June 30, 2006


Yeah, it's probably a scanning image. Can you like to an example?
posted by cillit bang at 12:13 PM on June 30, 2006


You can use OCR on the scans to get the text.
posted by signal at 12:50 PM on June 30, 2006


This would indicate the PDF is an image of text, rather than text (which to the computer is completely different).

Unless you have an OCR program installed that can try and turn the image into text, there's not much you can do about it.
posted by teece at 12:50 PM on June 30, 2006


If you are the one creating the pdfs, this is possibly related.
posted by gwint at 12:53 PM on June 30, 2006


Bypass the Preview program and use Acrobat directly?
posted by Juggermatt at 1:56 PM on June 30, 2006


Even Acrobat doesn't let me select the text (on any OS). One example would be this. This happens for a lot of Computer Science journal articles that I get.

So as teece said, there is nothing much I can do about it?
posted by raheel at 2:55 PM on June 30, 2006


Wow...I opened that pdf in Illustrator and discovered it's even worse. It's not an image of a page of text. Each individual letter on the page is an image itself. It's the most fucked-up thing I've seen in a pdf in a long time.
posted by Thorzdad at 3:26 PM on June 30, 2006


1975 embedded grayscale images on the first page alone. Geez.
posted by Thorzdad at 3:34 PM on June 30, 2006


Yeah, it's probably a scanning image. Can you like to an example?

What the hell?
posted by cillit bang at 4:31 PM on June 30, 2006


"like" = link, I believe
posted by raheel at 4:47 PM on June 30, 2006


That is a problem caused by TeX/LaTeX, a computer typesetter.

By default, LaTeX uses a bitmap font (i.e. an image for each letter) that is preferred over any vector fonts. And my god, it's ugly and stupid.

A description of the problem can be found here.
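
If you want a quick way to check whether a given PDF was built with bitmap fonts, here's a rough sketch in Python. It assumes the bitmap glyphs were embedded as Type 3 fonts and that the font dictionaries are stored uncompressed, which may not hold for every file (newer PDFs can hide them inside compressed object streams). The function name count_type3_fonts is made up for this example.

```python
import re

# Matches a font dictionary entry such as "/Subtype /Type3" or "/Subtype/Type3".
# Assumption: the font dictionaries are not inside a compressed object stream.
TYPE3 = re.compile(rb"/Subtype\s*/Type3\b")


def count_type3_fonts(pdf_bytes):
    """Count the Type 3 font declarations visible in the raw PDF bytes."""
    return len(TYPE3.findall(pdf_bytes))
```

To try it, read a file in binary mode and pass the bytes in, e.g. count_type3_fonts(open("paper.pdf", "rb").read()), where paper.pdf is a placeholder name. A non-zero count suggests bitmap fonts; zero is inconclusive because of the compression caveat.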
posted by easyasy3k at 5:43 PM on June 30, 2006


Aahh, I had a feeling it had to do with LaTeX! Oh well, guess I am stuck with this!!
Thanks all.
posted by raheel at 6:35 PM on June 30, 2006


If you use the -Ppdf option when running dvips on the latex-produced dvi file, it will use the outline fonts instead of the bitmap fonts, which will make the pdf look sharp.

Alternatively, you can use pdflatex to make the pdf directly from latex, but you might run into issues if you have postscript images in your latex file.

Alternatively, you can use dvipdfm to make the pdf instead of dvips.
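
For anyone who wants to script the three routes, here's a minimal sketch in Python. The helper names pdf_routes and run_route and the file stem "paper" are made up for illustration. The ps2pdf step (from Ghostscript) is an assumption: dvips only writes PostScript, so something has to convert it to PDF, and ps2pdf is one common choice.

```python
import shutil
import subprocess


def pdf_routes(stem):
    """Return the command sequence for each route, keyed by a short label."""
    return {
        # Route 1: dvips with -Ppdf (outline fonts), then convert PS to PDF.
        "dvips -Ppdf": [
            ["latex", f"{stem}.tex"],
            ["dvips", "-Ppdf", "-o", f"{stem}.ps", f"{stem}.dvi"],
            ["ps2pdf", f"{stem}.ps", f"{stem}.pdf"],  # assumed conversion step
        ],
        # Route 2: go straight from LaTeX source to PDF.
        "pdflatex": [["pdflatex", f"{stem}.tex"]],
        # Route 3: convert the DVI file to PDF without going through dvips.
        "dvipdfm": [
            ["latex", f"{stem}.tex"],
            ["dvipdfm", f"{stem}.dvi"],
        ],
    }


def run_route(stem, route):
    """Run every command of the chosen route, stopping at the first failure."""
    for cmd in pdf_routes(stem)[route]:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} is not on PATH")
        subprocess.run(cmd, check=True)
```

Calling run_route("paper", "dvips -Ppdf") would then build paper.pdf from paper.tex, provided the TeX tools and Ghostscript are installed.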
posted by raf at 7:40 PM on June 30, 2006

