What is a good way to convert complicated HTML to image or PDF?
June 1, 2004 6:55 AM

I'm looking for a way to convert some fairly visually complicated HTML to image or PDF. I've tried HTMLDOC, pdfcreator and Adobe Acrobat Pro but they all have HTML rendering glitches that are hard to troubleshoot. I'd love to be able to use the Mozilla renderer for this. Any ideas?
posted by frenetic to Computers & Internet (17 answers total)
 
If you want to convert it straight to a flat image format, just grab the screen one section at a time, then stitch the pieces back together in Photoshop.
posted by cheaily at 7:01 AM on June 1, 2004


Response by poster: Guess I should have said in the post that I want to do a huge number of conversions.
posted by frenetic at 7:03 AM on June 1, 2004


There are some programmatic solutions like iText or FOP.
posted by nims at 7:28 AM on June 1, 2004


they all have HTML rendering glitches that are hard to troubleshoot

This just seems to be true in general. Think of the number of rendering bugs in our actual browsers and this starts to make sense.

I wonder, though, if you could potentially use a Moz-browser and either AppleScript or Windows Scripting to do a print-to-PDF kind of thing.
posted by namespan at 7:54 AM on June 1, 2004
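
For what it's worth, the "script a Mozilla browser" idea can be sketched without AppleScript or Windows Scripting, assuming a much later Firefox build than anything available when this thread was written, one that supports the `--headless` and `--screenshot` command-line options. The helper names `screenshot_command` and `screenshot_all` and the `page-N.png` naming scheme are made up for illustration, and this produces images rather than PDFs.

```python
import shutil
import subprocess
from pathlib import Path


def screenshot_command(url, out_png, firefox="firefox"):
    """Build the command line for one headless-Firefox screenshot.

    Assumes a Firefox new enough to understand --headless and --screenshot.
    """
    return [firefox, "--headless", "--screenshot", str(out_png), url]


def screenshot_all(urls, out_dir, firefox="firefox"):
    """Render each URL to out_dir/page-N.png and return the target paths."""
    if shutil.which(firefox) is None:
        raise FileNotFoundError(f"{firefox!r} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for i, url in enumerate(urls, start=1):
        target = out_dir / f"page-{i}.png"
        # check=True raises CalledProcessError if Firefox exits with an error.
        subprocess.run(
            screenshot_command(url, target.resolve(), firefox), check=True
        )
        outputs.append(target)
    return outputs
```

Calling `screenshot_all(["http://example.com/"], "shots")` would attempt to write `shots/page-1.png`. Pages are rendered one after another, which is slow for a huge batch but keeps failures easy to trace to a single URL.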


You might try OpenOffice. Not sure if it would do a better job or not, but it can import from HTML and export to PDF. Also, Adobe Acrobat comes with a printer driver (not sure exactly what it's called) that should work better than rendering the HTML. Use the PDF printer when you print the file in Mozilla and it should look exactly like it does when you print to a regular printer.
posted by estey at 7:55 AM on June 1, 2004


Wait, if you have Acrobat Pro installed, can't you open the page in Mozilla/Firefox/whatever, go to the print setup, turn off the header and footer so it doesn't display the URL at the top of every page, and then print directly to Distiller? This is what I always do and I've never seen any glitches.
posted by Grod at 8:05 AM on June 1, 2004


Ditto the faux-printer route. That seems to work really well for the stuff I do.
posted by bonehead at 8:07 AM on June 1, 2004


Try PDF995. It's a faux-printer solution that's worked rather well for me to convert both HTML and MS Word docs to PDF. It's free. One tip (probably true for PDF creators everywhere): in the advanced printer settings, have the fonts set to "download as soft font" to avoid having them replaced with native PDF fonts.
posted by ewagoner at 8:15 AM on June 1, 2004


The free version of CutePDF is another faux-printer way of doing it, and one that I find superior to PDF995.

Al
posted by ajbattrick at 8:32 AM on June 1, 2004


OpenOffice is definitely worth a shot if you have a broadband connection.
posted by yerfatma at 8:37 AM on June 1, 2004


Response by poster: Well, I tried the printer suggestion with Adobe Acrobat (turns out pdfcreator and Acrobat's printer devices don't like to co-exist), and the output printed from both IE and Mozilla was far worse than what the internal Acrobat webpage conversion produces.

I'll give these other PDF converters a shot but I'm not holding out much hope. This doesn't seem like it should be so hard what with a good open source renderer kicking around, and there seems to be a large demand for it from what I've seen while searching around.
posted by frenetic at 8:46 AM on June 1, 2004


Print it to a Postscript file, and then zap it into a PDF with Ghostscript (using a GUI like FreePDF). Works every time.
posted by dayvin at 9:17 AM on June 1, 2004
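
Since the poster wants a huge number of conversions, the PostScript-to-PDF half of this suggestion can be batched by calling Ghostscript directly instead of going through a GUI. The following is a minimal sketch, assuming the pages have already been printed to `.ps` files in one folder and that the Ghostscript executable is on the PATH as `gs` (on Windows it is typically `gswin32c` or `gswin64c`). The function names `gs_pdf_command` and `convert_folder` are illustrative only.

```python
import subprocess
from pathlib import Path


def gs_pdf_command(ps_path, pdf_path, gs="gs"):
    """Build a Ghostscript command that converts one PostScript file to PDF."""
    return [
        gs,
        "-q",                 # suppress the startup banner
        "-dNOPAUSE",          # do not pause between pages
        "-dBATCH",            # exit when the input has been processed
        "-sDEVICE=pdfwrite",  # use the PDF output device
        f"-sOutputFile={pdf_path}",
        str(ps_path),
    ]


def convert_folder(folder, gs="gs"):
    """Convert every *.ps file in folder to a PDF with the same base name."""
    written = []
    for ps_file in sorted(Path(folder).glob("*.ps")):
        pdf_file = ps_file.with_suffix(".pdf")
        # check=True raises CalledProcessError if Ghostscript reports failure.
        subprocess.run(gs_pdf_command(ps_file, pdf_file, gs), check=True)
        written.append(pdf_file)
    return written
```

This only automates the conversion step. Any rendering glitches introduced when the browser printed the page to PostScript will still be present in the PDF.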


Mac OS X can print to PDF from any application. You should check it out.
posted by golo at 11:48 AM on June 1, 2004


Seconding golo. Works perfectly, I use it all the time.
posted by John Kenneth Fisher at 1:12 PM on June 1, 2004


Response by poster: I wouldn't say that PDF is perfect: there seems to be no background on the main page, the header, or the Google ad, and the form buttons are messed up. The fonts seem off too, but I have no idea what ask.mefi's fonts look like on a Mac.

cutePDF didn't work very well either, what a drag.
posted by frenetic at 2:07 PM on June 1, 2004


There is a "print background" option that I didn't enable. Obviously it's not perfect, the buttons are horrible (But it's pretty neat that it's there, systemwide).
posted by golo at 5:08 PM on June 1, 2004


seconding dayvin. if you dump to an EPS file, you ought to be able to open that file directly in acrobat or pull it up and convert with distiller. should fix your display glitches.

make sure the print settings of the page aren't the issue - if the page uses print-specific CSS it might be part of the problem (what you see on screen isn't what you get if printed, and so on)...
posted by caution live frogs at 5:23 AM on June 2, 2004
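
One way to check the print-specific CSS point before blaming the PDF tool is to scan each page for print stylesheets. This is a rough sketch using only the Python standard library. The class `PrintCSSFinder` and the function `find_print_css` are hypothetical names, and the check is deliberately simple: it flags explicit `media="print"` stylesheets and `@media print` rules inside inline `<style>` blocks, and it does not fetch or inspect external CSS files.

```python
import re
from html.parser import HTMLParser


class PrintCSSFinder(HTMLParser):
    """Collect signs that a page styles itself differently when printed."""

    def __init__(self):
        super().__init__()
        self.print_links = []              # hrefs of media="print" stylesheets
        self.has_print_media_rule = False  # inline print-only CSS was seen
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        media = (attrs.get("media") or "").lower()
        if tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            if "print" in media:
                self.print_links.append(attrs.get("href"))
        elif tag == "style":
            self._in_style = True
            if "print" in media:
                self.has_print_media_rule = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Look for "@media ... print" inside inline <style> content only.
        if self._in_style and re.search(r"@media[^{]*\bprint\b", data, re.I):
            self.has_print_media_rule = True


def find_print_css(html):
    """Return (print stylesheet hrefs, whether inline print rules exist)."""
    finder = PrintCSSFinder()
    finder.feed(html)
    finder.close()
    return finder.print_links, finder.has_print_media_rule
```

If either result is non-empty or true for a page, the printed output is expected to differ from the on-screen rendering, so the converter may not be at fault.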

