Recommendations for html/pdf creation
April 21, 2004 11:35 AM

i want to generate documentation for a project in both html and ps/pdf. the content is static and i need to generate previews on both linux and windows, off-line. i'm aware of (but haven't used, mostly) anakia, forrest, docbook, structured text, and latex2html. i don't expect to have any complex mathematical formulae, but may include some logic/category theory, so need some support for non-ascii text. code fragments will be common and i want good control over the final presentation. any experience/recommendations? oh, and the internal representation has to play nicely with cvs (ie no binary format). thanks.
posted by andrew cooke to Technology (16 answers total)
There's also ReStructured Text.
posted by kenko at 11:51 AM on April 21, 2004

I have been using ReStructured Text with a proprietary XSL-FO transformation that renders PDF directly from DocUtils XML output.

This system is, and I am underemphasizing this, kicking fucking ass.

I say that having spent years writing technical documentation using MSWord, then Ventura Publisher (god, that's a nice bit of software!), then futzing with XML editors.

RST has satisfied almost all my needs, and where it doesn't, I have been able to adapt my needs.

FWIW, I'm available for hire.
posted by five fresh fish at 11:57 AM on April 21, 2004

Response by poster: so is the docutils guy, by the looks of things. the pdf support (that's available for free) looks a little under-supported. is that the only (free) way to get pdf from rst? thanks.
posted by andrew cooke at 12:07 PM on April 21, 2004

XSLT is the standards-based tool that was basically designed for this kind of problem. It lets you take an XML source document, apply a particular stylesheet to it, and get a different XML (or non-XML) document as output. The classic example is taking an XML document representing, say, a magazine article and transforming it into an HTML page for that magazine's website, so that you can change the look of the whole website by changing one file. It's similar to CSS, but much more powerful.

I don't use the XSL-FO side of it, but XSLT does kick ass. It will give you very fine control over final presentation and is XML-based, so it supports Unicode (so you're not limited to ASCII).

It's not hard to learn either, despite what people who haven't tried it claim. It can work like CSS if you want (with certain tags getting a certain kind of formatting wherever they appear) or like a templating engine (it'll take a premade documentation template and just fill in the variables) or both at the same time.
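
To make the "CSS-like rules plus a template" idea concrete, here is a rough sketch in plain Python. It is not XSLT, only the same shape of transformation done with the standard library; the element names (article, title, para, code, em), the RULES table and the PAGE template are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape
from string import Template

# Hypothetical source document; the element names are made up for this sketch.
SOURCE = """<article>
  <title>Release notes</title>
  <para>Call <code>init()</code> before <em>anything</em> else.</para>
</article>"""

# CSS-like part: each tag gets the same wrapping wherever it appears.
RULES = {
    "para": ("<p>", "</p>"),
    "code": ("<tt>", "</tt>"),
    "em": ("<i>", "</i>"),
}

# Template-like part: a premade page with two holes to fill in.
PAGE = Template("<html><head><title>$title</title></head><body>$body</body></html>")


def render(node):
    """Recursively turn one element into HTML using RULES."""
    start, end = RULES.get(node.tag, ("", ""))
    parts = [start, escape(node.text or "")]
    for child in node:
        parts.append(render(child))
        # Text that follows a child element is stored on the child as .tail.
        parts.append(escape(child.tail or ""))
    parts.append(end)
    return "".join(parts)


def to_html(xml_text):
    """Fill the page template from the parsed source document."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    body = "".join(render(el) for el in root if el.tag != "title")
    return PAGE.substitute(title=escape(title), body=body)


print(to_html(SOURCE))
```

Changing an entry in RULES or the PAGE template changes every generated page, which is the "change one file" property described above. A real XSLT stylesheet expresses the same rules declaratively as templates matched against elements.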

I'm not familiar with the PDF format, and while I know that PDF output can be done (and is done), I'm pretty sure that generating PDF output will require familiarity with the actual PDF format, which may be more of a hassle than you're willing to go through.
posted by gsteff at 12:25 PM on April 21, 2004

Response by poster: i've used xsl, but since i want output in a language that's neither simple text nor sgml-like, i need a dedicated converter (pdf is based on postscript, which is a stack-based (forth-like) language that describes graphics). since the converter for xml->pdf will specify some format for xml it makes sense to look for a package that already supports both pdf and html, so that they can share the same xml source without the need for further transformations.
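
for what it's worth, the "one source, two outputs" shape can be sketched in a few lines of python. this is only an illustration under assumptions: the (kind, text) document model, the sample content and the function names are invented here, the latex escape table is not exhaustive, and the latex output would still need a preamble and a run through pdflatex (or similar) to become pdf.

```python
from html import escape

# Hypothetical in-memory document: a list of (kind, text) pairs.
DOC = [
    ("para", "Use fold_left & friends for lists < 100 items."),
    ("code", "let sum = List.fold_left (+) 0 xs"),
]

# Characters that need escaping in ordinary LaTeX text (not exhaustive).
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
    "<": r"\textless{}", ">": r"\textgreater{}",
}


def latex_escape(text):
    # Work character by character so replacements are never escaped twice.
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_html(doc):
    """Render the shared document model as an HTML fragment."""
    out = []
    for kind, text in doc:
        if kind == "code":
            out.append("<pre>" + escape(text) + "</pre>")
        else:
            out.append("<p>" + escape(text) + "</p>")
    return "\n".join(out)


def to_latex(doc):
    """Render the same document model as a LaTeX body fragment."""
    out = []
    for kind, text in doc:
        if kind == "code":
            # verbatim needs no escaping, but breaks if the code itself
            # contains the closing \end{verbatim} marker.
            out.append("\\begin{verbatim}\n" + text + "\n\\end{verbatim}")
        else:
            out.append(latex_escape(text) + "\n")
    return "\n".join(out)


print(to_html(DOC))
print(to_latex(DOC))
```

the point is only that both renderers read the same source, so code fragments and prose are written once and each backend applies its own escaping.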

i understand the theory, i just want an implementation. and i don't really care if the base format is xml or something else, as long as it gets the job done.
posted by andrew cooke at 12:37 PM on April 21, 2004

If you happen to like the idea of using LaTeX as a source language, then you should have a look at HeVeA before immediately jumping to latex2html. The latter is more powerful, but will also produce a more complicated and possibly cluttered document. HeVeA is very fast and generally produces pretty nice (and straightforward) output. It does not support conversion of math to images, but does have some support for non-ascii characters.

I wouldn't really recommend either latex2html or HeVeA for this sort of thing, as LaTeX lacks some of the structure necessary to do really good conversion to multiple output formats. Nonetheless, I've done it myself simply because I know LaTeX and have little desire to learn Yet Another Markup Language.
posted by Galvatron at 12:46 PM on April 21, 2004

Here's an example of some nice HeVeA output, where the author adds a stylesheet to control the presentation.
posted by Galvatron at 1:01 PM on April 21, 2004

David Goodger is available for hire for programming. I'm available for hire for writing/editing. But that's all probably moot.

If you look in the /dpriest directory you'll find the beginnings of my XSL-FO transformation file. As soon as I can figure out the SourceForge CVS again (it keeps wanting to interfere with my real-life CVS), I'll be updating those files. I'm sure you'll be able to adapt them to your needs.

The previous PDF generators are all shite. Mine takes DocUtils' raw XML and transforms it using XSL-FO, which is about as good as you're going to get. (The others tried to do PDF through other means.)

I'm also beginning to toy with the idea of an XSL-FO transformation to create Microsoft CHM help files. I haven't coded even line one of that project, though.

And, finally, I've begun to create a cooperative collaborative RST-viewing/editing system. It'll be many, many months before I have even a beta for that, given how little I can work on it.
posted by five fresh fish at 2:13 PM on April 21, 2004

Response by poster: thanks. i was hoping for something simpler, but it looks like that (via XSL FO) or latex/hevea (which would be cool, since i'd be using a program written in one ML dialect to document a program written in another :o). cheers everyone.
posted by andrew cooke at 3:37 PM on April 21, 2004

RST will be appropriate if your math/symbol typesetting requires only inline Unicode characters. If you're getting into fancy things-over-things layout (fractions, calculus, etc.) then LaTeX is a far better choice. Though someone's currently working on a way to insert LaTeX math layout into RST text streams...
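
As a small illustration of what "inline Unicode characters" can look like in practice, here is a hedged Python sketch. The sample formula and the helper names are made up; the sketch only shows that logic symbols typed directly into the source can be emitted as numeric character references, so the HTML stays pure ASCII.

```python
import unicodedata

# Hypothetical sample text with logic symbols typed inline as Unicode.
TEXT = "∀x ∈ A, P(x) ⇒ Q(x)"


def ascii_html(text):
    """Replace every non-ASCII character with a numeric character reference.

    This does not escape &, < or >; that is a separate step.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def describe_non_ascii(text):
    """List the non-ASCII characters in the text with their Unicode names."""
    return [(ch, unicodedata.name(ch)) for ch in text if ord(ch) > 127]


print(ascii_html(TEXT))  # &#8704;x &#8712; A, P(x) &#8658; Q(x)
for ch, name in describe_non_ascii(TEXT):
    print(ch, name)
```

Anything that needs vertical layout (fractions, limits, stacked symbols) has no single-character equivalent, which is where this approach stops being enough.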
posted by five fresh fish at 4:06 PM on April 21, 2004

BTW, I'd like to know what your solution ends up being. Please look up my user profile and mail me when you've got it all hashed out. Thanks!
posted by five fresh fish at 4:06 PM on April 21, 2004

fff: the /dpriest directory of ... what? Sorry if I'm being obtuse.
posted by kenko at 5:30 PM on April 21, 2004

Sorry, I was being obtuse. This directory (two up and one across from the one Andrew identified earlier.)

Shucks, now I'm gonna feel all pressured to actually figure out SourceForge CVS once again...
posted by five fresh fish at 9:48 PM on April 21, 2004

have you checked out DITA?
posted by snowgoon at 4:49 AM on April 22, 2004

Response by poster: kenko - i just googled for dpriest and xsl ;o)
fff - will do, but it may be a month or so before this gets sorted out. cvs should "just work" if you've already got CVS directories present, i believe (i think they define CVSROOT in the local tree - I use more than one CVS source and it's been so problem-free that i have no clear idea how it works).
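
for the record, cvs keeps the repository location in a file called Root inside the CVS subdirectory of every checked-out directory, which is why several working copies from different servers can sit side by side. a small python sketch to list them (the function name is made up for illustration):

```python
import os


def cvs_roots(top):
    """Map each checked-out directory under `top` to the contents of its CVS/Root file."""
    roots = {}
    for dirpath, dirnames, filenames in os.walk(top):
        # Each working directory has a CVS/ subdirectory holding a Root file.
        if os.path.basename(dirpath) == "CVS" and "Root" in filenames:
            with open(os.path.join(dirpath, "Root")) as fh:
                roots[os.path.dirname(dirpath)] = fh.read().strip()
    return roots
```

calling cvs_roots(".") from the top of a checkout returns one entry per working directory, so two trees pointing at different repositories show up with different values.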

dita looks interesting, but seems to define only a standard for the data format, not provide tools for the actual generation of documents (but i'd love to be proved wrong!)
posted by andrew cooke at 6:46 AM on April 22, 2004

Response by poster: i said i would report back. for now i'm going with almost free text because it's (very) simple and looks ok by default. when i get more time i'll use latex and hevea (i can get from aft to latex, so am not losing anything).
posted by andrew cooke at 11:35 PM on May 13, 2004
