Batch converting PDFs, how to?
March 21, 2012 10:47 AM

How does one convert hundreds of multi-page PDFs (of varying page sizes per PDF and varying total page count) so that only a single JPEG of the first page of each PDF is saved, with the original file name intact? I have access to Photoshop CS 5 (not 5.5) and IrfanView 4.
posted by Brandon Blatcher to Computers & Internet (16 answers total) 3 users marked this as a favorite
 
ImageMagick is your friend. convert filename.pdf[0] filename.jpg will convert the first page of filename.pdf to a jpeg named filename.jpg.

At that point it's about your favorite scripting environment. If you're on a Mac or have access to a Linux box, after installing ImageMagick you can pull up a terminal, cd into the appropriate directory, and type:

for a in *.pdf ; do convert "${a}[0]" "${a%.pdf}.jpg" ; done

On Windows you could install Cygwin, or figure out how to do this in one of the other command shells there.
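
If the one-liner feels too terse, here is a slightly more defensive sketch of the same idea. It assumes bash and an ImageMagick install that provides the convert command; the function names first_page_jpeg_name and convert_first_pages are made up for illustration.

```shell
# Sketch: first page of every PDF in a directory -> JPEG with the same base name.
# first_page_jpeg_name and convert_first_pages are hypothetical helper names.

# Map "some report.pdf" to "some report.jpg" (also handles an upper-case .PDF).
first_page_jpeg_name() {
    local pdf
    pdf=$1
    case "$pdf" in
        *.pdf) printf '%s.jpg\n' "${pdf%.pdf}" ;;
        *.PDF) printf '%s.jpg\n' "${pdf%.PDF}" ;;
        *)     printf '%s.jpg\n' "$pdf" ;;
    esac
}

# Convert page 0 (the first page) of each PDF in the given directory.
convert_first_pages() {
    local dir pdf
    dir=${1:-.}
    for pdf in "$dir"/*.pdf "$dir"/*.PDF; do
        [ -e "$pdf" ] || continue      # skip glob patterns that matched nothing
        convert "${pdf}[0]" "$(first_page_jpeg_name "$pdf")" ||
            printf 'failed: %s\n' "$pdf" >&2
    done
}
```

Run it as convert_first_pages /path/to/pdfs. The JPEGs land next to the PDFs, and a file that fails to convert is reported instead of stopping the whole run. Note that ImageMagick 7 names its main command magick, and that Windows ships an unrelated convert.exe, so the command name may need adjusting on your setup.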
posted by straw at 10:59 AM on March 21, 2012 [3 favorites]


I think Irfanview does this when converting PDF to JPG. I know it annoyed the hell out of me that you couldn't convert multiple page PDFs to JPGs. It would only do the first page.
posted by sanka at 11:00 AM on March 21, 2012


Response by poster: ImageMagick does not look like my friend. Surely there's an easier way than writing a script by hand?
posted by Brandon Blatcher at 11:01 AM on March 21, 2012


Response by poster: When using Irfanview, I get repeated errors of "Can't open 'filename'.pdf"
posted by Brandon Blatcher at 11:04 AM on March 21, 2012


Do you have the full version of Acrobat? If so, you could install a jpeg printer driver (e.g. "paperless printer"), and set this as your default printer. In Acrobat, go to advanced -- document processing -- batch processing -- choose print 1st page of all documents.
posted by prenominal at 11:10 AM on March 21, 2012


I think straw's answer might be your best bet, as long as you don't care too much about image quality. It's a one line operation, barely worth the label of a "script", and the command

$ convert fileN.pdf[0] fileN.jpg

does exactly what you need for each pdf file. The default output quality might be problematic, though - if so, you'll need to poke at the commandline options, which may be unpleasant.
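
For what it's worth, the option-poking mostly comes down to two flags. A rough sketch, assuming bash and ImageMagick's convert; render_first_page is a hypothetical name, and the 150 DPI and quality 90 defaults are just illustrative starting points:

```shell
# Sketch: render the first page at a chosen resolution and JPEG quality.
# render_first_page and its default values are illustrative assumptions.
render_first_page() {
    local pdf dpi quality
    pdf=$1
    dpi=${2:-150}
    quality=${3:-90}
    # -density goes before the input so the PDF is rasterized at that DPI;
    # -quality sets the JPEG compression level (1-100).
    convert -density "$dpi" "${pdf}[0]" -quality "$quality" "${pdf%.pdf}.jpg"
}
```

Something like render_first_page report.pdf 300 85 would write report.jpg at 300 DPI. ImageMagick's default density is 72 DPI, which is the usual reason the out-of-the-box result looks soft.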
posted by RedOrGreen at 11:31 AM on March 21, 2012


Response by poster: Do you have the full version of Acrobat? If so, you could install a jpeg printer driver (e.g. "paperless printer"), and set this as your default printer. In Acrobat, go to advanced -- document processing -- batch processing -- choose print 1st page of all documents.

This is close. The only problem is that the Paperless Printer wants to save each JPEG in a separate folder. This is unpleasant. Any way around that?
posted by Brandon Blatcher at 11:44 AM on March 21, 2012


Note that if you go the ImageMagick route (and, really, it is worth learning how to script stuff and use command-line tools, your computing life will change dramatically for the better) you can tweak the resolution of the conversions. For instance

convert -density 600x600 -geometry 612x792 filename.pdf[0] filename.jpg

will render out the PDF at 600DPI and resample the output to a 612x792 result.

Aaand, any editor worth its salt should allow you to record and play back macros. Even Microsoft Word.

So on DOS you could type "dir > dostuff.bat", open dostuff.bat in Word, and use Word's macro capability to change each line to have the conversion stuff in it, and then execute that batch file.
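
If you do have a Unix-style shell handy (Cygwin, say), the same "turn a file listing into a batch file" trick can be sketched without an editor macro. This assumes bash; make_batch_lines is a made-up helper name, and it only handles names ending in lower-case .pdf:

```shell
# Sketch: turn a list of file names (one per line on stdin) into batch-file lines.
# make_batch_lines is a hypothetical helper name.
make_batch_lines() {
    local name
    while IFS= read -r name; do
        case "$name" in
            # Only PDFs get a line; anything else in the listing is ignored.
            *.pdf) printf 'convert "%s[0]" "%s.jpg"\n' "$name" "${name%.pdf}" ;;
        esac
    done
}
```

Usage would be along the lines of printf '%s\n' *.pdf | make_batch_lines > dostuff.bat, followed by a read-through of dostuff.bat before running it. File names containing a percent sign would need extra escaping in a batch file, which this sketch does not attempt.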
posted by straw at 12:03 PM on March 21, 2012


Response by poster: I do not wish to use ImageMagick, thank you. Please quit suggesting it.
posted by Brandon Blatcher at 12:07 PM on March 21, 2012


You can do this fairly easily in Photoshop (if you're familiar with Actions). You do it once (open the PDF in Photoshop, save as jpg) and record it, and then play the Action as part of a batch.
Here's a decent tutorial of the basics.
posted by FreezBoy at 12:09 PM on March 21, 2012


Response by poster: It does not seem to work in Photoshop. After recording the action, the program simply reuses the name of the originally saved JPG, over and over, saving each new JPG over the last saved one.
posted by Brandon Blatcher at 12:14 PM on March 21, 2012


Automator to the rescue!

Automator can do all sorts of things with PDFs, and it's super easy to set up - just create a workflow that takes the first page off of a PDF and converts it to a JPEG.

If you're having trouble with this I'll try and write you a workflow for this, but this should be enough to get you started.
posted by modernserf at 12:40 PM on March 21, 2012


Response by poster: The files are in a 100% Windows shop (XP, Vista and 7) and come to about 16 gigs, so they are not easily moved. No, there isn't a removable drive available.
posted by Brandon Blatcher at 12:46 PM on March 21, 2012


To get Irfanview to open PDFs, you need plugins and Ghostscript installed.
posted by zsazsa at 12:52 PM on March 21, 2012


Best answer: Here's how to do it using Batch Actions in Photoshop CS 5

File>Scripts>Image Processor.

Select Input and output folders

Choose JPEG as the file type, with a quality setting. Since the files are of varying physical aspect ratios, don't choose a specific size.

Click Run.

This outputs large jpegs, at Photoshop's default resolution of 300 dpi. But it does do just the first page, does it quickly (about 50 files in roughly 3 minutes) and does not change the file name.

From this point, it's easy to create and run a Photoshop action to change the resolution of all the files.
posted by Brandon Blatcher at 1:01 PM on March 21, 2012 [1 favorite]


Response by poster: Figured it out myself, so marked my answer as Best. Thanks all!
posted by Brandon Blatcher at 1:22 PM on March 21, 2012


This thread is closed to new comments.