What's the best tool to write a code-heavy ebook?
July 5, 2013 5:06 AM

I'm writing a programming ebook (on introductory programming for bench biologists, if anyone's interested) and am trying to figure out the best tool chain. Ideally, I'd like to write the content in something like Markdown or reStructuredText, then be able to easily render it to PDF (and maybe eventually epub/mobi). The book will contain a lot of source code and program output, so whatever solution I pick needs to cope well with chunks of differently-formatted text. Any suggestions?
posted by primer_dimer to Computers & Internet (18 answers total) 6 users marked this as a favorite
LaTeX to PDF directly with pdflatex; LaTeX through tex4ht to html through calibre to mobi or epub?
posted by ROU_Xenophobe at 5:16 AM on July 5, 2013

posted by jferg at 5:16 AM on July 5, 2013

I'm using Sphinx for a very similar task: accepts restructured text, exports to everything. Lots of plugins.
posted by outlier at 5:17 AM on July 5, 2013

Nthing LaTeX. Bit of a learning curve, but it's one more thing to put in your book!
posted by supercres at 5:28 AM on July 5, 2013

Definitely LaTeX, especially if you want to include any equations.
posted by quaking fajita at 5:34 AM on July 5, 2013

And when you're ready to make an EPub, Pandoc. Of course, that might do a lot of the heavy lifting even if you don't use LaTeX (but you should).
posted by supercres at 5:38 AM on July 5, 2013

I wrote a 200-page math textbook using LaTeX. I would not hesitate to use LaTeX again to do the same. However, I would second the Sphinx suggestion given what you are asking. Sphinx can also be used to produce very nice looking web versions. And, if the code used is Python, you can have the book become an interactive website using Crunchy.
posted by aroberge at 5:43 AM on July 5, 2013 [2 favorites]

Here's the thing, there's no easy tool-chain here. If you want an excellent-looking professional quality pdf then you're not going to be able to translate that directly and easily to an ebook format.

The less you care about how the pdf looks the easier the translation will be.

So, nthing LaTeX (specifically LuaTeX so you can easily use your own fonts, microtype, and selnolig).

And then copy the text over to whatever ebook program you want and reformat it again.

I don't have a ton of experience doing this but based on what I have done I found that you pretty much have to start over in each format if you want it to look its best.
posted by bfootdav at 5:48 AM on July 5, 2013

I watched this really great talk a couple of days ago where a professor of chemical engineering used Emacs + org-mode + python to write code intensive papers and books: Emacs + org-mode + python in reproducible research
posted by toddje at 5:50 AM on July 5, 2013 [1 favorite]

Do test pandoc on a small section first before relying on it. There are some things it can't convert in all directions. (For instance, I believe it can't do HTML tables to LaTeX tables. You just get a pile of text instead.)
posted by hoyland at 6:23 AM on July 5, 2013

Just did this for a textbook I helped publish.

The option that most directly answers your question (re: Markdown) is to use Pandoc. You can write your book in Markdown and use Pandoc to generate the ebooks directly. Pandoc's help page on generating ebooks shows an example of building an ebook from Scott Chacon's book Pro Git, which can be built from his source on GitHub. He wrote his book in Markdown. This is probably the simplest option for you.
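
For anyone who wants to script that route, here is a minimal sketch in Python, assuming pandoc is installed and on your PATH. The chapter file names and the helper names (pandoc_epub_command, build_epub) are made up for illustration; --from, --toc and --output are standard pandoc options.

```python
import shutil
import subprocess


def pandoc_epub_command(chapters, output="book.epub"):
    """Return the pandoc command line that joins Markdown chapters into one EPUB."""
    return [
        "pandoc",
        "--from", "markdown",   # input format
        "--toc",                # ask pandoc for a table of contents
        "--output", output,     # the .epub extension selects EPUB output
        *chapters,              # chapters are concatenated in this order
    ]


def build_epub(chapters, output="book.epub"):
    """Run pandoc; raises if pandoc is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_epub_command(chapters, output), check=True)


# Hypothetical chapter files; print the command so it can be checked first.
print(" ".join(pandoc_epub_command(["ch01.md", "ch02.md"])))
```

Building the command as a list first means you can look at it before anything runs; call build_epub with your real chapter files once it looks right.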

The route I went was a bit more complicated. Because we were publishing the book as a fully-formatted PDF that could be physically printed down the road, we used LaTeX for the master document. To set up the ebook, I used a free/OSS ePub editor, Sigil. Sigil makes it super easy to create an epub from HTML files. You can load a non-DRM epub to see how a document is formatted, but basically it is just one or more HTML files for the chapters, images, and a CSS file for formatting. You can then generate a Table of Contents, cover page, etc. Sigil would also be handy for tweaking any epub you generated through Pandoc.
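
If you want to peek inside an epub without Sigil: an epub is a ZIP archive, so Python's standard zipfile module can list what is in it. A rough sketch follows; the function name epub_parts is made up, and the demo archive is a tiny stand-in built in memory, not a complete valid epub.

```python
import io
import posixpath
import zipfile
from collections import defaultdict


def epub_parts(epub):
    """Group the files inside an epub (a ZIP archive) by file extension."""
    parts = defaultdict(list)
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            ext = posixpath.splitext(name)[1].lstrip(".").lower()
            parts[ext].append(name)
    return dict(parts)


# Stand-in archive with made-up file names, just to show the grouping.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("mimetype", "application/epub+zip")
    zf.writestr("OEBPS/ch01.xhtml", "<html></html>")
    zf.writestr("OEBPS/style.css", "pre { font-family: monospace; }")

print(epub_parts(demo))
```

Pass the path of a real non-DRM epub instead of demo to see how its chapters, images and stylesheets are laid out.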

Using Sigil meant I had to convert the LaTeX source to HTML. I used Pandoc, but its LaTeX->HTML conversion is imperfect; it gets you about 90% of the way. You have to go through and fix broken tables, images, and labels/references.

Once I had finished the epub, I used an online service (Epub2Mobi) to convert it to a Kindle format, since we wanted to make the text available for Kindle as well as the Nook and iBooks app. Alternately, you can use Amazon's free Kindle tools to create your Kindle book.

Because of the headaches with the LaTeX->HTML path, I wouldn't recommend it unless you had a compelling reason, such as needing fancy PDF formatting.
posted by insert.witticism.here at 7:01 AM on July 5, 2013

And after rereading your question, I realized I lost track of some of the details you described. (Time to get to bed.) In your case, LaTeX might be the ticket. But look at how the Pro Git book was written and generated. Follow my Pandoc help link and generate the book for yourself to see how the process works. That might be good enough for you.

If you decide to go the LaTeX route, there are a couple decent LaTeX book templates scattered around online. Everything else I mentioned will apply for that production chain.
posted by insert.witticism.here at 7:09 AM on July 5, 2013

Sphinx is very nice if the source code is Python, but can be a bit of a pain if it isn't.

I'd recommend LaTeX. The package you want for including nicely formatted code is listings.
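
If your code samples live in separate files or strings, a small script can wrap them for listings. This is only a sketch: the helper name lstlisting is made up, and it assumes your preamble already has \usepackage{listings}. The language and caption keys are standard listings options.

```python
def lstlisting(code, language="Python", caption=None):
    """Wrap source code in a listings `lstlisting` environment.

    Assumes the LaTeX preamble already loads the package with
    \\usepackage{listings}.
    """
    options = [f"language={language}"]
    if caption:
        options.append(f"caption={{{caption}}}")  # braces protect commas in captions
    body = code.strip("\n")  # drop leading and trailing blank lines only
    return (
        "\\begin{lstlisting}[" + ", ".join(options) + "]\n"
        + body + "\n"
        + "\\end{lstlisting}"
    )


# Made-up snippet of the sort a book for biologists might include.
print(lstlisting('seq = "ATGC"\nprint(len(seq))', caption="Counting bases"))
```

Paste the printed block into your .tex file, or write it to a file and pull it in with \input.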
posted by RonButNotStupid at 7:49 AM on July 5, 2013

definitely LaTeX.
posted by cupcake1337 at 8:48 AM on July 5, 2013

It seems that there are lots of LaTeX folks here, and that is great, but I would slightly question whether that would totally make sense in this case. There aren't that many equations; this is going to be code-block heavy.

DocBook was invented as a standard for writing programming textbooks, and nearly every book about programming (Safari, O'Reilly, etc.) is written using DocBook because it works so well for it. It is at least worth taking a look at.

It looks like pandoc plays well with DocBook, and so a markdown-pandoc-DocBook workflow would be totally possible.
posted by rockindata at 9:23 AM on July 5, 2013

nthing LaTeX. You can use LyX or Gummi to make your life easier. Most of my computational biology textbooks were written in LaTeX.
posted by gemutlichkeit at 12:44 PM on July 5, 2013

I used LaTeX with the listings package for my printed CS homework... everything from simply printing out pages of source code to mostly natural language with occasional snippets of source in various languages. I like it a lot.

That said, if you keep your markup simple and semantic, you can probably write a script to translate between modern markups without too much difficulty, and keeping the markup clean is a good practice anyway.
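
To give a feel for how small such a script can be, here is a rough sketch that converts a tiny Markdown subset (# headings and fenced code blocks) to reStructuredText. Everything in it is an assumption made for illustration: the function name, the underline characters chosen for each heading level, and the fallback language "text". Anything outside that subset passes through untouched.

```python
import re

# Assumed convention: one underline character per heading level.
UNDERLINES = {1: "=", 2: "-", 3: "~"}


def markdown_to_rst(text):
    """Translate '#' headings and fenced code blocks; leave other lines alone."""
    out = []
    in_code = False
    for line in text.splitlines():
        if line.startswith("```"):
            if not in_code:
                lang = line[3:].strip() or "text"
                out.append(f".. code-block:: {lang}")
                out.append("")  # blank line before the directive content
            else:
                out.append("")  # blank line closes the indented block
            in_code = not in_code
            continue
        if in_code:
            # Directive content must be indented; keep blank lines empty.
            out.append("    " + line if line else "")
            continue
        match = re.match(r"^(#{1,3})\s+(.*)$", line)
        if match:
            title = match.group(2).strip()
            out.append(title)
            out.append(UNDERLINES[len(match.group(1))] * len(title))
        else:
            out.append(line)
    return "\n".join(out)


sample = "# Intro\n\nSome text.\n\n```python\nprint('hi')\n```\n"
print(markdown_to_rst(sample))
```

Links, emphasis, tables and nested lists are not handled, which is where keeping the markup simple pays off.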
posted by anaelith at 3:28 PM on July 5, 2013

An alternative to pandoc (which I also recommend) is AsciiDoc, which can also ultimately output both PDF and EPUB. Just use whichever seems more natural to you.

FWIW, O'Reilly Media has largely moved from DocBook XML master files to AsciiDoc (which then outputs DocBook and flows through their existing toolchain).

I would not recommend hassling with LaTeX unless you had a lot of math to deal with, and then you're going to run into trouble when formatting that math for most ereaders.
posted by nev at 6:35 PM on July 5, 2013
