How to Batch Process PDFs
November 5, 2012 4:31 AM

How can one batch-modify PDFs of scanned books and magazines for easier reading on a Kindle? Specifically: a) minimize the margins, and b) repaginate scans that have two book pages per PDF page.

I have some philosophy readings that are scanned as two book pages to a landscape A4 page. Is there any easy way to split these into single pages for reading on a Kindle?

Also it would be handy to just have an easy (batch process) way of minimizing all the margins on PDFs for reading on the Kindle. Is there any way to do that?
posted by mary8nne to Technology (8 answers total) 10 users marked this as a favorite
 
Sure, there are apps that can do that.

If you use a Mac, the strong integration the operating system has with the PDF format is helpful; for example, the built-in Preview application can crop multiple pages in a PDF with very little fuss.

If you use Windows or Linux, take a look at Scan Tailor, which is used by the DIY book scanning community to prep large numbers of pages.
posted by bcwinters at 4:56 AM on November 5, 2012 [4 favorites]


Have a look at briss or pdfscissors.
posted by pharm at 5:13 AM on November 5, 2012


Yes, Scan Tailor is pretty handy for this.
posted by Rykey at 5:56 AM on November 5, 2012


I use a combination of pdfscissors and k2pdfopt.

Use pdfscissors to cut the PDF into columns, then k2pdfopt to format them nicely for the Kindle.
posted by calm down at 7:51 AM on November 5, 2012 [1 favorite]


I use a script that is a slight variation on the Ghostscript example in this superuser.com answer.

The change I made is to use:
pdftk \
B=right-sections.pdf \
A=left-sections.pdf \
shuffle A B \
output "$DOC_INTERM"

rather than

pdftk \
A=right-sections.pdf \
B=left-sections.pdf \
cat B1 A1 B2 A2 B3 A3 B4 A4 B5 A5 B6 A6 B7 A7 B8 A8 \
output $DOC_INTERM

This saves having to know how many pages have been created.
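
To illustrate what shuffle avoids, here is a rough sketch of what the cat form would need for an arbitrary page count. The function name interleave_pages is made up for this example, and the handle letters follow the cat command above (B for the left halves, A for the right halves).

```shell
#!/bin/sh
# Hypothetical helper: print the explicit page list that "cat" needs
# for n pages per input, e.g. "B1 A1 B2 A2" for n=2.
interleave_pages() {
    n=$1
    i=1
    list=""
    while [ "$i" -le "$n" ]; do
        list="$list B$i A$i"
        i=$((i + 1))
    done
    # Strip the leading space left over from the first append.
    printf '%s\n' "${list# }"
}

# Assumed usage: the page count has to be looked up first (for example
# from the NumberOfPages line of "pdftk left-sections.pdf dump_data"),
# which is exactly the step that shuffle makes unnecessary.
# pdftk A=right-sections.pdf B=left-sections.pdf \
#     cat $(interleave_pages 8) output "$DOC_INTERM"
```

With shuffle, pdftk works out the interleaving itself, so none of this bookkeeping is needed.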
posted by stuartmm at 8:08 AM on November 5, 2012


I use Scan Tailor, but my main problem is the size of the resulting file. These files are usually so big that even a first-generation iPad cannot open them, so I wonder whether a Kindle will.

Scan Tailor will do the job, but you need to figure out the next step of reducing the file size. I tried everything: printing through another PDF printer, and file reduction using the presets in Adobe Acrobat. When the file size goes down, so does the quality, so I am stuck there. I have tons of books scanned and converted to PDF with Scan Tailor.
posted by zaxour at 7:04 AM on November 6, 2012


Response by poster: I did a book using pdfscissors and it seems to have worked quite well (and it's much the same size as the original PDF, so that's fine).

I've read other PDFs on the Kindle and I find it's OK with moderately high-quality scans, so this seems OK.
posted by mary8nne at 8:04 AM on November 6, 2012


This is more of an answer for zaxour than the OP.

To reduce the size of a scanned PDF, use Adobe's ClearScan option, available in their OCR engine. I also use Scan Tailor to process scanned books and feed the images to Acrobat Pro. ClearScan works amazingly well and can reduce an 80MB PDF to about 7MB without any obvious loss in image quality.
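
For anyone without Acrobat Pro, a different technique (not ClearScan) is to re-save the file through Ghostscript's pdfwrite device, which downsamples the page images. This is only a sketch: the function names and the "-small.pdf" naming are made up for the example, and because it recompresses the scans, the size-versus-quality trade-off mentioned earlier in the thread still applies.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name, book.pdf -> book-small.pdf.
small_name() {
    printf '%s\n' "${1%.pdf}-small.pdf"
}

# Re-save every PDF in a directory through Ghostscript's pdfwrite device.
# /ebook is one of Ghostscript's stock presets; /screen gives smaller,
# rougher output and /printer gives larger, sharper output.
shrink_all() {
    for f in "$1"/*.pdf; do
        [ -f "$f" ] || continue                    # no PDFs matched
        case $f in *-small.pdf) continue ;; esac   # skip earlier output
        gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
           -dNOPAUSE -dBATCH -dQUIET \
           -sOutputFile="$(small_name "$f")" "$f"
    done
}

# Assumed usage, with ./scans standing in for your own folder:
# shrink_all ./scans
```

It is worth comparing one output file against the original on the actual device before running this over a whole library.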
posted by volition at 1:25 PM on November 6, 2012

