Converting large HTML files to PDF or Word
November 30, 2019 3:19 PM

I'm having trouble converting a large HTML file (around 35MB) to PDF or Word. I have a Mac with 4GB of RAM. I've tried wkhtmltopdf but it didn't work.

The 'print to PDF' function in Chrome/Opera either crashes or outputs only one page. I believe the convert-to-PDF function crashes my version of Adobe. I read somewhere that converting such a large file would require about 20GB of RAM.

Any suggestions? Thanks in advance.
posted by kinoeye to Technology (5 answers total)
 
Why is this file so big? Does it just have an enormous amount of text content? If so, can you convert it to PDF if you manually delete most of that text first? If that works, and you don't need a completely faithful representation of the HTML, you could split the page into sections, convert each section to a PDF, then combine them all at the end.
posted by J.K. Seazer at 3:39 PM on November 30, 2019 [2 favorites]
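
The split-then-convert idea above could be sketched roughly as follows. This is only an illustration, not a tested recipe for the 35MB file in question: it assumes the sections start at h1/h2 headings that sit directly inside <body>, and the names split_html and max_chars are made up for this example. Converting each chunk and merging the resulting PDFs are left to whichever tool ends up working.

```python
import re

# Zero-width pattern: split just before every <h1> or <h2> opening tag.
# Assumption: headings are direct children of <body>; if they are nested
# inside wrapper elements, the chunks will contain unbalanced tags.
HEADING = re.compile(r"(?=<h[12][\s>])", re.IGNORECASE)


def split_html(html, max_chars=2_000_000):
    """Split one large HTML string into smaller standalone HTML documents.

    Sections are grouped greedily so that each chunk stays under roughly
    max_chars characters (a single oversized section is kept whole).
    """
    flags = re.IGNORECASE | re.DOTALL
    body_match = re.search(r"<body[^>]*>(.*)</body>", html, flags)
    body = body_match.group(1) if body_match else html

    # Reuse the original <head> so styles and the charset carry over.
    head_match = re.search(r"<head[^>]*>.*?</head>", html, flags)
    head = head_match.group(0) if head_match else "<head><meta charset='utf-8'></head>"

    chunks = []
    current = ""
    for section in HEADING.split(body):
        # Start a new chunk when adding this section would exceed the limit.
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        current += section
    if current.strip():
        chunks.append(current)

    return [
        f"<!DOCTYPE html><html>{head}<body>{chunk}</body></html>"
        for chunk in chunks
    ]
```

Each returned string can be written to its own file (part-001.html, part-002.html, and so on) and converted separately, which keeps the amount of content any one conversion has to handle small.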


Pandoc should be able to do both: PDF and Word (.docx)
posted by mustardayonnaise at 3:43 PM on November 30, 2019 [4 favorites]
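
A minimal sketch of what the Pandoc route might look like when driven from Python. It assumes pandoc is installed and on the PATH; the helper names pandoc_command and convert are hypothetical. Pandoc picks the output format from the file extension, and PDF output additionally needs a PDF engine installed (a LaTeX distribution by default). Whether it handles a file this large on 4GB of RAM is not something this sketch can promise.

```python
import shutil
import subprocess


def pandoc_command(src, dest):
    """Build the pandoc argument list for converting an HTML file.

    The output format (.docx, .pdf, ...) is inferred by pandoc from the
    extension of dest.
    """
    return ["pandoc", "--from", "html", "--output", dest, src]


def convert(src, dest):
    """Run pandoc, raising if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(src, dest), check=True)
```

For example, convert("big.html", "big.docx") would attempt the Word conversion, and convert("big.html", "big.pdf") the PDF one.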


WeasyPrint (Python) can be good for this
posted by scruss at 8:29 PM on November 30, 2019
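
For reference, a basic WeasyPrint conversion might look like the sketch below. It assumes the weasyprint package is installed (for example via pip); the helper names pdf_path_for and html_to_pdf are made up for this example. It has not been tried on a 35MB input, so memory use on a 4GB machine remains an open question.

```python
from pathlib import Path


def pdf_path_for(html_path):
    """Return the input path with its extension replaced by .pdf."""
    return str(Path(html_path).with_suffix(".pdf"))


def html_to_pdf(html_path, pdf_path=None):
    """Render a local HTML file to PDF with WeasyPrint."""
    # Imported here so the module loads even if WeasyPrint is not installed.
    from weasyprint import HTML

    pdf_path = pdf_path or pdf_path_for(html_path)
    HTML(filename=html_path).write_pdf(pdf_path)
    return pdf_path
```

Calling html_to_pdf("big.html") would write big.pdf next to the input file.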


Are you trying to print the actual HTML (i.e. the page source code)? Or are you trying to save the visual appearance of the webpage, as seen on-screen, as a PDF?

If it's just the code, you should just be able to right-click > View Page Source then Save Page As... and select Format: Web Page, HTML Only. Then open that file in Preview and save it as a PDF.
posted by Thorzdad at 6:55 AM on December 1, 2019


I routinely print 100+ page websites as PDFs using the Safari browser. Give that a try first?
posted by soylent00FF00 at 2:56 PM on December 1, 2019

