Not Planning Ahead Sucks
April 1, 2005 7:48 PM
I have 12 issues of an image-intensive 'zine in PDF. I've slowly been converting them to XHTML by hand. I'm halfway through the 5th issue and I'm sick of it. Is there any easy, or less difficult, way to convert these?
For example, this (Intrapdf trial version), or this (Adobe online conversion).
posted by Boobus Tuber at 3:38 AM on April 2, 2005
Acrobat can export to HTML, though it is not valid code. (HTML Tidy can assist with that.)
Platform? Number of pages? Multicolumn layouts? More info, please. Example if possible.
posted by joeclark at 6:19 AM on April 2, 2005
Response by poster: Sorry about not including my system specifications. I'm running Mac OS X, though I do have access to an XP box.
Adobe's online conversion simply doesn't cut it. It rearranges the formatting and leaves the areas for the images out. I get 23 pages of text, essentially.
Acrobat's export to HTML has never really worked properly, either. It's not just that it needs tweaking; it basically mucks up the formatting to the point where tweaking it doesn't save any time.
Again, I have no problem going in and fixing up some things, but the output has to be usable.
The 'zines are about 25 pages.
Examples:
PDF version
HTML version
posted by Captaintripps at 7:38 AM on April 2, 2005
Best answer: All right, looking at your sample PDF file I see that it was created in PageMaker 7. (How retro!) PM7 can produce a tagged PDF. Do another export with tagging turned on, and no, I don't know which little button to tick, but it won't be hard to find. (On checking more closely in Acrobat's HTML export settings, it converts to tagged PDF itself. Nonetheless, give 'er a whirl.)
In Acrobat, an export to HTML or even plain text (the choice that isn't marked "Accessible" tends to be cleaner) will now quite probably work better.
I tried HTML+CSS export and the results are not really that bad. Your use of pictures of text for headlines is a complication. You have to write alt texts for your images. Some search-and-replace in BBEdit (also use of Tidy) will improve things. Don't stop till you've got actual valid code.
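For the alt-text part of that cleanup, a short script can list which images in an exported page still need attention. What follows is only a rough sketch, not anything Acrobat or Tidy provides: it assumes the export is ordinary HTML or XHTML that Python's standard html.parser can read, and the names MissingAltFinder and find_missing_alt are made up for illustration. It flags empty alt attributes as well as missing ones, since a picture-of-text headline needs real alt text; a purely decorative image with an intentionally empty alt will show up too and can be ignored.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that have no alt attribute or an empty one.

    Hypothetical helper for illustration; not part of any export tool.
    """

    def __init__(self):
        super().__init__()
        self.missing = []  # list of (line number, src) pairs

    def handle_starttag(self, tag, attrs):
        # XHTML-style <img ... /> also ends up here, because the default
        # handle_startendtag() calls handle_starttag().
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        if alt is None or not alt.strip():
            line, _column = self.getpos()
            self.missing.append((line, attr_map.get("src") or ""))


def find_missing_alt(html_text):
    """Return (line, src) for every image lacking usable alt text."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    sample = (
        '<p><img src="a.gif" alt="Headline" /></p>\n'
        '<p><img src="b.gif" /></p>\n'
        '<img src="c.gif" alt="">'
    )
    for line, src in find_missing_alt(sample):
        print(f"line {line}: {src} needs alt text")
```

To check a real page, read the exported file into a string and pass it to find_missing_alt; the line numbers make it easy to jump to each image in BBEdit.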
Twelve issues are not that many issues. Don't try to do them all at once, but the job will gradually get done.
posted by joeclark at 3:00 PM on April 2, 2005
This thread is closed to new comments.
posted by Boobus Tuber at 3:34 AM on April 2, 2005