
Looking for examples of web sites that can generate polished pdfs.
January 22, 2014 12:46 PM

I’m looking for examples of web sites that generate fairly polished pdfs when a visitor wants to print out a page of the site. Does such a thing exist?

We produce a media kit every year, a catalog of advertising and promotional opportunities available with our organization. We used to have it offset printed, then put pdfs of individual pages online. But with the number of changes and updates we were doing, it was often out of date by the time it came back from the printer. Now, we just create pdfs in InDesign, and update those constantly. It’s tiresome.

It seems like it would be much easier to present this info on a web site, since it needs updating so frequently, and have the visitor print it out from there as needed; however, our advertising manager is convinced that her audience (mostly agency media buyers) needs to have a good-looking, well-designed piece of paper to bring to meetings with clients -- and printouts of web pages often look terrible (our site looks especially bad printed).

So is there a way to have a web page output a clean, well-designed pdf? Ideally, it’d be great if the visitor could select the items she’s interested in, then have just the selected items assembled on the fly into some sort of print-ready page or template, with some standard boilerplate added to each page. Doesn’t have to be a pdf -- the goal is to output a good-looking page from web site content.

Whatever we propose will have to be fairly idiot proof; screen shots won’t cut it. Is it a matter of adding some code to the site, or some other hack?
I could’ve sworn I’ve seen something like this before, but I’m having no luck finding examples, and starting to think I imagined the whole idea.

Any tips, comments, etc are appreciated.
posted by Bron to Computers & Internet (12 answers total) 10 users marked this as a favorite
I used to work on software that was intended to allow authors to work in a single format and output to both PDF and HTML. Getting anything but a very bare template to work well in both of them is a nightmare (the software was eventually discontinued). There are a few reasons for this:

1) There are not actually very many people with PDF expertise. Lots and lots of software people can write HTML. Almost none of them can write PostScript. Most of those people seem to work at Apple and Adobe. Thus, unless you're using Apple's or Adobe's tools, there's not really an easy way to convert something to PDF. Yes, I know there are exceptions to this, but not very many, especially when compared to HTML tools.

2) This is the biggest one: Pagination. A web page can be any width, and any height. Printed pages are all A4 or 8.5x11" or what have you. There does not exist an algorithm that can decide things like where to break pages and which page to put an image on such that it "looks well designed". You can come up with an algorithm that just keeps dumping stuff onto the page until it's full, and then start the next page, but that doesn't always look great. You can end up with page two of a two-page document having a single line on it, when if you were designing it specifically to print, you would have reduced the leading a tiny bit to avoid that.

It's feasible that someone has come up with better software in the last couple of years that helps with this, and I'm just not aware of it, but I'd be really doubtful that there's anything that can produce PDFs that actually look designed from HTML input. This is less true for something like a book format where the content is almost entirely text, but the more graphs, figures, images, headlines, etc that you include, the more likely that automatic layout algorithms aren't going to do a very good job.
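The "keep dumping stuff onto the page until it's full" approach from point 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every block is reduced to a single height, and the function name `paginate` and the numbers are made up for the example.

```python
def paginate(block_heights, page_height):
    """Greedy pagination: fill each page until the next block won't fit."""
    pages = [[]]
    used = 0
    for height in block_heights:
        # Start a new page when this block would overflow the current one.
        # A block taller than a whole page still gets placed on an empty page.
        if pages[-1] and used + height > page_height:
            pages.append([])
            used = 0
        pages[-1].append(height)
        used += height
    return pages


# Hypothetical document: 51 lines of height 1 on a page that holds 50.
pages = paginate([1] * 51, page_height=50)
print([len(page) for page in pages])  # [50, 1]: page two holds a single line
```

The algorithm is correct in the sense that nothing overflows, but it has no notion of "tighten the leading a little so the last line fits", which is exactly the judgment a print designer would apply.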
posted by tylerkaraszewski at 12:55 PM on January 22 [1 favorite]

Smashing Magazine has an article on How To Set Up A Print Style Sheet. While it's not a PDF, CSS media types will get you pretty far, including setting print-specific page breaks and layout.

The article also has a list of print optimized sites at the very end.
posted by Nonsteroidal Anti-Inflammatory Drug at 1:00 PM on January 22 [2 favorites]

I came in to say tylerkaraszewski's #2. When you're putting things into different formats you end up re-formatting and often even re-writing them a bit, and that's that. There is just no way around it. The medium really is the message, to a greater degree than we usually think in day-to-day life.

The fancier your formatting etc, the more this is true. If you just have a completely plain text file with a few titles and headings, that's pretty easy to transfer between a few dozen different formats. As soon as you introduce graphics, photos, call-outs, etc etc etc it becomes a matter of hiring someone to re-edit, re-format, and lay it out again for you in a way that actually looks good in your new medium.

In moving from one format to another, there is a lot of re-use of content but not so much of formatting and layout.
posted by flug at 1:03 PM on January 22

Why doesn't someone talk to members of the "audience" themselves to see if they agree that they need something up-to-date and printed as a polished item for these presentations? It could be that it would be satisfactory to have reasonably up to date printed versions with a reminder that the product is updated on the web site itself. The concept will be familiar to everyone.

Additionally, I would bet that many agency media buyers use, or would be quite receptive to the alternative idea of using, a tablet which will allow for quick access to the web site in real time.
posted by yclipse at 1:33 PM on January 22

Not sure if this is what you mean, but I just use PDF995. It is free. You use it like a printer, so the look will be as if you'd printed a hard copy, but instead of printing a hard copy it will print a PDF which you save to your desktop. It won't reformat anything though.
posted by St. Peepsburg at 1:48 PM on January 22

Archive of Our Own allows you to download any work as a pdf, or as many other digital formats. It is an open source project that is kind of perpetually in beta, but still the best thing out there in terms of fanfic archives. I don't know how the pdf download handles images or formatting, as I've never investigated that. But they are simple to contact and fairly communicative.
posted by Mizu at 2:38 PM on January 22

Thanks, everyone -- this is very helpful. Guess I was imagining the whole thing. I'm going to pass this thread on to the team and try and figure out our next move. (I'm really intrigued with the print style sheet article, thanks for posting that, NAID. But I don't know how much work that would be to get it working across the whole project. It's kinda looking like the labor would just get shifted from the print designers to the web designers, with the client maybe not as satisfied in the end.)

"Why doesn't someone talk to members of the "audience" themselves to see if they agree that they need something up-to-date and printed as a polished item for these presentations?"

I know, right? It would probably be much cheaper to buy each of the media buyers a new iPad and work toward making the site look good onscreen, compared to the amount of work that goes into this thing. It's politics, and the Way Things Are Done Around Here.
posted by Bron at 5:34 PM on January 22

If I were going to attempt this, I would probably start by dynamically generating LaTeX source, then pdflatex-ing it into a pdf. I'm not sure how well, or even if, this would work in real time, server-side and on demand.
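A rough sketch of that idea in Python, assuming the visitor's selection arrives as (title, body) pairs. The function names, the `kit.tex` file name, and the sample item are all hypothetical, and `render_pdf` assumes `pdflatex` is installed and on the PATH; it is defined here but not called.

```python
import subprocess
from pathlib import Path

# Characters LaTeX treats specially; they must be escaped in body text.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape special characters one at a time, so nothing is escaped twice."""
    return "".join(_LATEX_ESCAPES.get(ch, ch) for ch in text)


def build_latex(items):
    """Return LaTeX source with one unnumbered section per (title, body) item."""
    lines = [r"\documentclass{article}", r"\begin{document}"]
    for title, body in items:
        lines.append(r"\section*{" + escape_latex(title) + "}")
        lines.append(escape_latex(body))
        lines.append("")  # blank line ends the paragraph
    lines.append(r"\end{document}")
    return "\n".join(lines)


def render_pdf(source, out_dir):
    """Write the source to out_dir and run pdflatex on it (must be installed)."""
    tex_path = Path(out_dir) / "kit.tex"
    tex_path.write_text(source, encoding="utf-8")
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode",
         f"-output-directory={out_dir}", str(tex_path)],
        check=True,
    )
    return tex_path.with_suffix(".pdf")


# Made-up selection, just to show the generated source.
print(build_latex([("Print & Web Bundle", "Save 10% on combined placements.")]))
```

Generating the source is cheap; the open question raised above is the `pdflatex` run itself, which starts a separate process for every request.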
posted by ctmf at 8:53 PM on January 22

We're using hiQPDF in some web-based .NET software we're designing to control pdf print-out of dynamically generated marketing materials that contain Google graphs and ledger tables. Integrating it with the dev environment and getting the webpage ratios set correctly was a little cumbersome at first, but it's working well for our needs. Not sure if this is the sort of thing you had in mind.
posted by stagewhisper at 10:53 PM on January 22

We use Atlassian Confluence for our support documentation - it's not exactly what it was designed for, but it lets us collaboratively work on documentation like a wiki, have it be accessible to customers, and it'll happily generate PDFs for download/print. The PDFs it generates aren't 100% perfect, and you're limited to their kind of formatting, but it might be worth looking into.
posted by spielzebub at 8:40 AM on January 23

CSS only gets you so far, unfortunately. For the times that I've had to do something similar that couldn't easily be handled by CSS, I've ended up using ReportLab, which is a Python library that allows you to build a PDF.

If you want to see something similar in action, just go to Wikipedia and choose the export to PDF/Create book feature. That uses the mwlib Python library; you can read more on that on the ReportLab site.

It all comes down to whether you can create a template that works consistently for the data you're feeding it.
posted by Static Vagabond at 9:19 AM on January 23 [1 favorite]

I've marked the two answers that seem to be gaining some traction on this project; unfortunately, it looks like we're going to miss the deadline, so it's back to InDesign pdfs for this cycle. I have high hopes for next time, though. Thanks for the input, everyone.
posted by Bron at 6:07 PM on February 10
