How do I antialias scanned TIFF images?
April 2, 2007 8:50 AM

Scanning a book and converting to a PDF - how do I antialias the scanned page images?

I'm in the process of scanning a long-out-of-print book and making it into a PDF.

I've scanned the pages as 300dpi 1-bit B&W TIFF files. When viewed in Preview or Acrobat Reader (which have "antialias" features), they look great. When I look at them with Foxit Reader or xPDF, the pages look rough and "jaggy".

My question - is there a software filter or effect that I can apply to the TIFF files (even if I have to make the result 4 or 8-bit grayscale instead of B&W) that will give me the same effect without having to depend on the PDF viewer application?

Some sample images:

Introduction (TIFF | GIF)
Page 91 (TIFF | GIF)

I've looked at ImageMagick, but it apparently will only antialias images when converting PostScript or vector image formats into bitmaps.

(The reason for my efforts is that there is still a demand for the book - originally priced $15.95 in 1981, used copies now go for anywhere from $40 to $400. This is being done with the full permission of the author/copyright holder, who will make the book available for download on his website.)
posted by mrbill to Computers & Internet (8 answers total)
 
Best answer: The difficulty is in scaling for display—if you want to display scaled-down images with nice anti-aliasing, you're married to a viewer that supports that as a display-time feature, period.

What you can do is batch-process your source images to create scaled-down-with-antialiasing display images, and there are plenty of applications that can do that for you for $nil-to-free. However, that will only work if you can be satisfied with a specific set of fixed resolutions (e.g. 400px wide, 600px wide, etc) to use as your display format.
posted by cortex at 9:08 AM on April 2, 2007


Response by poster: It appears that applying a very slight "blur" filter in Photoshop to one of the TIFF images (after converting to grayscale) does something similar to what I'm looking for. Unfortunately, this increases the file size by almost 10x, and it would have to be automated (with ~300 pages).

Cortex, if you've got suggestions for the scaling-with-antialiasing applications please let me know what you have in mind.
posted by mrbill at 9:19 AM on April 2, 2007
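
A rough way to see where the size increase comes from is to compare uncompressed pixel data at different depths and resolutions. The sketch below is only arithmetic: the 8.5 x 11 inch page size is an assumption (the thread never gives the book's dimensions), `raw_bytes` is a hypothetical helper, and row padding and compression are ignored. Real files will differ, since 1-bit scans tend to compress far better than blurred grayscale.

```python
# Uncompressed size arithmetic for one scanned page.
# Assumption (not from the thread): a page of 8.5 x 11 inches.
PAGE_W_IN, PAGE_H_IN = 8.5, 11.0


def raw_bytes(dpi, bits_per_pixel):
    """Uncompressed bitmap size in bytes, ignoring row padding."""
    width_px = round(PAGE_W_IN * dpi)
    height_px = round(PAGE_H_IN * dpi)
    return width_px * height_px * bits_per_pixel // 8


bilevel_300 = raw_bytes(300, 1)  # the original 1-bit scan
gray8_300 = raw_bytes(300, 8)    # same resolution, 8-bit grayscale
gray8_150 = raw_bytes(150, 8)    # half resolution, 8-bit grayscale
gray4_150 = raw_bytes(150, 4)    # half resolution, 16 gray levels

for label, size in [
    ("300 dpi, 1-bit", bilevel_300),
    ("300 dpi, 8-bit", gray8_300),
    ("150 dpi, 8-bit", gray8_150),
    ("150 dpi, 4-bit", gray4_150),
]:
    print(f"{label}: {size / bilevel_300:.1f}x the raw 1-bit data")
```

Under these assumptions, going to 8-bit at full resolution is 8x the raw data, while halving the resolution and keeping 16 gray levels lands back at the raw size of the 1-bit original.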


Oh, you have Photoshop? Just use a size ratio of 2 when you convert to grayscale. You'll end up with 150 dpi anti-aliased files.

And you can automate anything in Photoshop. Look at the batch options on the File menu.
posted by smackfu at 9:25 AM on April 2, 2007
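
What a 2:1 reduction with conversion to grayscale does to a 1-bit image can be sketched in plain Python: each 2x2 block of black/white pixels is averaged into one gray pixel, so edges that cut through a block come out as intermediate grays. This is a minimal illustration using a simple box average, not necessarily the resampling filter Photoshop applies; `downsample_2x` and the tiny `edge` bitmap are made up for the example.

```python
def downsample_2x(bitmap):
    """Average each 2x2 block of a 1-bit bitmap into one 8-bit gray pixel.

    bitmap: list of rows, each a list of 0 (black) or 1 (white).
    Returns rows of gray values in 0..255. An odd trailing row or
    column is dropped.
    """
    height = len(bitmap) // 2 * 2
    width = len(bitmap[0]) // 2 * 2 if bitmap else 0
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (bitmap[y][x] + bitmap[y][x + 1]
                     + bitmap[y + 1][x] + bitmap[y + 1][x + 1])
            row.append(total * 255 // 4)
        out.append(row)
    return out


# A small diagonal edge: black in the upper left, white in the lower right.
edge = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
]
gray = downsample_2x(edge)
print(gray)  # [[0, 191], [191, 255]]
```

The two blocks that straddle the diagonal become light gray (191) instead of snapping to pure black or white, which is the smoothing effect a viewer's antialiasing would otherwise provide at display time.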


The beloved-and-maligned GIMP has long been my standby. Free, easy install, and the interface is non-bizarre these days. May be a little heavier than your specific needs, but it'll work.

I got a free copy of Photoshop Elements recently with a hardware purchase, and it's been a dream for simple batch jobs. Don't know what it retails at.

Having wrestled with ImageMagick more than once, I'll say that it's never been painless but I've always gotten it to do what I wanted.

But it sounds like you have access to Photoshop itself already? In which case a batch-process applying either the slight-blur thing to the at-size images, like you reference, or just doing a resize/resample with bilinear filtering or similar would work well.

Is file size an issue? Reducing the resolution to 150-100 dpi will probably leave you with a document perfectly usable for screen-reading, though if your intent is to let people actually print this thing they may be dissatisfied with the output on paper.
posted by cortex at 9:27 AM on April 2, 2007


Best answer: You have *really* nice scans. I'm going to suggest you find a copy of an OCR program like ABBYY FineReader, which does an absolutely incredible job with images scanned as well as these. It will even preserve paragraph flow around images.

That will solve your scaling problem, and greatly reduce filesizes. It will also make it possible to convert it to all kinds of other formats.

If you want to talk about this software more, please email me; my address is on my website, which is linked in my profile.
posted by fake at 9:44 AM on April 2, 2007


Just to prove that OCR is going to do a great job with these scans, I ran your scan through SimpleOCR, a web-based service. While this in no way competes with FineReader, which will preserve your page layout appearance, it should give you an idea of the accuracy available on even bottom-end, free services.
posted by fake at 9:49 AM on April 2, 2007


Response by poster: Fake, this book has a huge amount of images and diagrams but I'll check out FineReader - it looks like they have a 15-day free trial which would be adequate for this project.
posted by mrbill at 9:54 AM on April 2, 2007


Best answer: Using GIMP, here is what I ended up with (picture will be up for a day or two, but probably not forever). You'll note that the file size is 69.3 KB against the original 56.4 KB. My experiments suggest that this is the lowest you're going to get going from large-but-1-bit images to small-but-fuzzy ones (for example, when I just blurred and scaled down, the image size was, as you suggest, quite a bit larger).

Process:
1 px horizontal, 2 px vertical Gaussian blur
scale image: 11 inches across (adjust as you like), 72 dpi, cubic interpolation (this is important)
reduce colour depth to a 16-colour indexed "optimum" palette
save as PNG, maximum compression

This process should be easy to automate, although you'll probably prefer to do so using Photoshop so I'll spare you the details.
posted by anaelith at 9:54 AM on April 2, 2007
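
The blur, scale-down and palette-reduction steps listed above can be sketched in plain Python to show what each stage does to the pixels. This is a rough stand-in under stated assumptions, not GIMP's implementation: a [1, 2, 1] kernel replaces the Gaussian blur, block averaging replaces cubic interpolation, the palette step is read as 16 evenly spaced gray levels, and writing the PNG is left out. All function names here are hypothetical.

```python
def blur_1d(values):
    """Apply a [1, 2, 1] / 4 kernel to one row or column, clamping at the edges."""
    n = len(values)
    out = []
    for i in range(n):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        out.append((left + 2 * values[i] + right) // 4)
    return out


def blur(image):
    """Separable blur: rows first, then columns."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def shrink(image, factor):
    """Average factor x factor blocks; leftover edge pixels are dropped."""
    h = len(image) // factor * factor
    w = len(image[0]) // factor * factor
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def quantize(image, levels=16):
    """Snap 0..255 grays to `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    return [[round(round(v / step) * step) for v in row] for row in image]


def antialias_page(bitmap, factor=4):
    """1-bit rows (0 = black, 1 = white) -> blurred, shrunk, 16-level gray rows."""
    gray = [[255 * bit for bit in row] for row in bitmap]
    return quantize(shrink(blur(gray), factor))


# Tiny test page: left half black, right half white.
page = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
result = antialias_page(page, factor=4)
print(result)
```

A factor of 4 is close to the 300 dpi to 72 dpi reduction described in the list; a batch job would apply the same function to each page and then write the result with whatever image library is at hand.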

