Convert Anything to PDF
June 29, 2011 11:29 AM

PDF-filter: I'm running a website that allows users to upload files. I'm looking for a converter that would allow User B to view files uploaded by User A, even if User B doesn't have the software that was used to create the file. My guess is that converting everything (that's not a simple image) to PDF would work, but my Google-fu is failing me and I can't find a PHP script or Apache module that will do this.

The classic example would be that User A uploads a WordPerfect file but User B doesn't have WordPerfect installed, yet I need User B to be able to view the file. I've been experimenting with Scribd, but it's a Flash system so won't work on iOS devices (and I'm not a fan of the interface).

A more realistic example for me is that User A will upload an AutoCAD blue-printy type file and User B won't be able to view it at all, even if User B downloads it instead of trying to view it online.
posted by GatorDavid to Computers & Internet (5 answers total)
 
My guess is converting everything (that's not a simple image) to PDF would work, but my Google-fu is failing me and I can't find a PHP script or Apache module that will do this.

What do you mean by "everything"? There's no general-case solution for converting binary data from application A to a PDF. You need a program to render WordPerfect to PDF, another for AutoCAD to PDF, etc. Even these are kind of underdefined. PDF is a page description language: you set up a page in terms of its dimensions, etc., and then draw things like text and lines to it. I don't know anything about WordPerfect, but my guess is that its files essentially store descriptions of the document along with formatting settings, that WordPerfect relies on the WP application to render that document to the printer drivers (of which PDF would be one), and that it needs user input to do so (though you could probably get away with reasonable defaults most of the time). To reduce this to an absurd case, what if the user uploads an audio file? Or a filesystem image?

Long story short, this is a way underspecced problem right now, and when you spec it more narrowly you are going to have to go on a case-by-case basis, filetype-wise.
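To make the case-by-case point concrete, here is a minimal Python sketch of per-filetype dispatch. The extension list and the converter family names ("image", "office", "cad") are assumptions made up for illustration, not a list of what any real tool supports; anything outside the table is reported as unsupported rather than guessed at.

```python
from pathlib import Path

# Hypothetical table: which family of converter each extension would need.
# Every entry here is an illustrative assumption, not a tested capability.
CONVERTER_FOR_EXTENSION = {
    ".png": "image",
    ".jpg": "image",
    ".tif": "image",
    ".doc": "office",
    ".wpd": "office",
    ".dwg": "cad",
}


def pick_converter(filename):
    """Return the converter family for a file, or None if it is unsupported."""
    return CONVERTER_FOR_EXTENSION.get(Path(filename).suffix.lower())


for name in ("floorplan.DWG", "letter.wpd", "song.mp3"):
    family = pick_converter(name)
    print(name, "->", family if family else "unsupported format")
```

Each family still needs its own real converter behind it; the table only decides which one to call.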
posted by jeb at 11:48 AM on June 29, 2011 [1 favorite]


These guys are doing what you seek to do.

Maybe you can leverage their technology. They might offer an API, or maybe a paid account would allow you to run your conversions through them. No idea if they allow such usage.
posted by chazlarson at 12:43 PM on June 29, 2011


Yeah, what jeb said - what you are asking for is not trivial.

Another thing to consider is when the conversion will take place: If you do it when user A uploads the file, then user B will get the converted (presumably less-functional) version of the file, even if they already have the application required for the original file. If you keep several versions, you need a way to figure out which one user B wants, since you cannot in general know what programs user B has.
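One way to sidestep the either/or is to keep the original untouched and create the PDF copy only the first time somebody asks to view it. A rough Python sketch follows, assuming a hypothetical `convert(source, target)` callable that does the actual conversion; `cached_pdf` and the cache layout are invented for illustration.

```python
from pathlib import Path


def cached_pdf(original, cache_dir, convert):
    """Return the path of a PDF copy, converting only when it is missing or stale.

    `convert` is a hypothetical callable taking (source, target) paths.
    The original upload is never modified, so both versions can be offered.
    """
    original = Path(original)
    target = Path(cache_dir) / (original.name + ".pdf")
    stale = (
        not target.exists()
        or target.stat().st_mtime < original.stat().st_mtime
    )
    if stale:
        target.parent.mkdir(parents=True, exist_ok=True)
        convert(original, target)
    return target
```

The download link can then point at the original and the "view" link at the cached PDF, which leaves the choice to user B instead of trying to guess what software they have.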

Conversion might also be a performance issue, since converting arbitrary AutoCAD files to PDF is bound to be pretty computationally intensive - what happens if several users request conversions at the same time?
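If conversions do happen on the server, one common way to keep simultaneous requests from swamping it is to push them through a small fixed-size worker pool. A minimal Python sketch, where the limit of 2 and the `convert(source, target)` callable are assumptions for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed limit; the right number depends on the server and the converters.
MAX_PARALLEL_CONVERSIONS = 2

# Jobs beyond the limit wait in the pool's queue instead of running at once.
_pool = ThreadPoolExecutor(max_workers=MAX_PARALLEL_CONVERSIONS)


def submit_conversion(convert, source, target):
    """Queue one conversion and return a Future for its result.

    `convert` is a hypothetical callable taking (source, target).
    """
    return _pool.submit(convert, source, target)
```

Extra requests are delayed rather than refused, so the page would need some kind of "conversion in progress" state while a job is waiting.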

Overall this sounds like a problem best solved on the user side - maybe a page with links to useful free converters and a couple of strategically placed hints to the user (e.g. near your download links) would cover most of what you want to achieve.
posted by Dr Dracator at 1:07 PM on June 29, 2011


Response by poster: Okay, I can clarify this a bit. I wasn't asking to see non-viewable multimedia files (mp3 files or mov files, for example) and the users won't be uploading them either. If someone were to upload an mp3 file, I'd expect the "viewer" to simply say it's an unsupported format.

But surely there is some way to convert most common text and / or graphics files to PDF.

(The system is currently using Scribd and they're considering switching to Zoho. Both appear to be Flash-based systems, which is a bummer because viewers won't be able to use iOS devices, and both of these services require that the document get uploaded to their servers.)
posted by GatorDavid at 2:44 PM on July 29, 2011


Here's another outfit that seems to have such a thing available for one to run on one's own server.

ImageMagick [free software] will convert just about any image format to PDF.
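For the image case, a small Python wrapper around the ImageMagick command line might look like the sketch below. It assumes ImageMagick is installed and on the PATH; the classic command is `convert` and newer releases also ship `magick`. The function names are made up for this example.

```python
import shutil
import subprocess


def imagemagick_command(source, target):
    """Build the ImageMagick command line for converting one image to PDF."""
    # Prefer the newer `magick` executable, fall back to the classic `convert`.
    # Caution: on Windows, `convert` may resolve to an unrelated system tool.
    exe = shutil.which("magick") or shutil.which("convert") or "convert"
    return [exe, str(source), str(target)]


def image_to_pdf(source, target):
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(imagemagick_command(source, target), check=True)
```

ImageMagick picks the output format from the target's extension, so `image_to_pdf("scan.png", "scan.pdf")` would be the typical call.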

The "text" formats [Word, WordPerfect] will be trickier. The easiest ways to do that are the myriad "Print to PDF" virtual printer drivers around. Lacking something like that, you need software that can read whatever formats you want to support and then write a PDF. Coming up with the first half of that is most of the difficulty.
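As one illustration of "software that can read the format and then write a PDF", LibreOffice can be run without a GUI from the command line. This is a swapped-in suggestion, not something named above, and whether it renders a given WordPerfect or Word file acceptably is something to test rather than assume. The sketch assumes the `soffice` executable is on the PATH; the function names are invented.

```python
import subprocess
from pathlib import Path


def office_to_pdf_command(source, out_dir, soffice="soffice"):
    """Build a headless LibreOffice command that writes a PDF into out_dir."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]


def office_to_pdf(source, out_dir):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(office_to_pdf_command(source, out_dir), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(source).stem + ".pdf")
```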

Google's showing no shortage of commercial products that convert [bunch of document formats] to PDF, and the specific one you might want to use will depend on how you're deploying it [web-accessible, Windows-based web server, etc.] and probably how much you want to spend on it.

The simplest way to do this would be to have a person print every document to PDF using a free PDF print driver as needed, but of course there are scaling issues there ;).
posted by chazlarson at 12:38 PM on August 1, 2011

