How would you write/publish an e-book?
February 27, 2004 2:11 PM

I'm interning at my university's press, and we are interested in:

A) Creating electronic copies of old, out-of-print books in our catalog. These would most likely just need to be searchable files, with no added links or information. Does anyone know of software that would scan and convert them? What kind of file would we be getting? PDF?


B) Taking a mostly complete book and giving it the so-called super-deluxe e-book treatment. This would include tons of embedded links, meta tags, the whole shebang. Again, what kind of language/software/file format would we want to use?


C) Starting from scratch with a new book, again going for the deluxe treatment.

A few conditions:
1) We must retain the copyright to anything we do.
2) We must be able to sell it, thus no conversion software that restricts this. Also, what would be the pros/cons of selling a CD/DVD vs. secure web access/download?

Any specific industry knowledge would be greatly appreciated, but I'm also kind of curious what people would think to do off the top of their heads. If you wanted to write an e-book tomorrow, where would you start?
posted by rorycberger to Media & Arts (5 answers total)
Well, there are any number of ways. As Joe Clark points out, you can only enhance plain text in one way, which is to structure it in some meaningful sense.

So I'd start by creating .txt files of your chosen opuses. The brute-force way is to scan in the texts with OCR software, then comb the output closely for errors before saving it as plain text. You can then lard these liberally with whatever links, easter eggs, and other goodies you wish.
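
As a rough illustration of the cleanup step, here is a minimal Python sketch that tidies raw OCR output before a human proofreads it. The function name `clean_ocr_text` and the specific rules (rejoining words hyphenated across line breaks, collapsing hard line wraps inside paragraphs) are assumptions made for this example, not something any particular OCR package provides, and it does not catch misrecognized characters.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy raw OCR output before proofreading (hypothetical helper)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at the end of a scanned line.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, collapse line wraps and repeated spaces.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "The quick brown fox jum-\nped over\nthe  lazy dog.\n\n\nSecond para-\ngraph here.\n"
    print(clean_ocr_text(sample))
```

One known limitation: a genuinely hyphenated compound that happens to break at a line end (say "well-" followed by "known") will be joined into one word, so the output still needs a close read.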

For selling them, I might consider a micropayment facilitator like bitPass.

Good luck.
posted by adamgreenfield at 2:21 PM on February 27, 2004

I don't understand the difference between writing an 'e-book' and writing a plain old 'book', which is to say that I don't think there is one.

The 'e-' is just a distribution method for the content of the book, using bits instead of atoms.

With regard to format, I'd advise plain text to start, as Adam suggests. That can then be ported and gussied up in any way you like. Keep it simple, and remember that if the book is something that many people want, once it's digitized it will be pirated, regardless of any fancy-schmancy safeguards you attempt to put in place to protect it.

The way that Cory Doctorow has been approaching the simultaneous release of his books in bits and atoms, even going as far as allowing 'remixing' in his Creative Commons licences, is eminently smart, I think.
posted by stavrosthewonderchicken at 5:17 AM on February 28, 2004

Response by poster: As far as piracy goes, that isn't really a concern for us at all. Though I don't deny that it probably will happen in some form or other, we are an academic press, and thus primarily sell to libraries, particularly university libraries, and they tend to pay for things rather than steal them. We do need to be able to charge money for it, but we would only need to use minimal security.
posted by rorycberger at 12:05 PM on February 28, 2004

I would go with PDF, especially if you want to keep the styles and formatting of the original. Books originally created in Quark can be converted really easily, as can books for which you have digital plates from the printer.
posted by amberglow at 12:47 PM on February 29, 2004

Absolute plain text is the worst possible option, as Adam pointed out by linking to me. EuroCory is simply wrong about that, as is everyone who agrees with him.

There are lots of micropublishers producing E-books right now; they just aren't mainstream-popular, meaning that few of us have ever heard of them. They nearly all use PDF and they almost never charge more than US$10.

PDF is not at all a bad format when handled properly. You can OCR your scanned original copy right in Acrobat. It will make all the usual OCR errors, which I do not quite know how to correct. (I've only done the exercise once.)

It is thus possible, if you really need to, to produce a PDF with page images and underlying text, if you want to reproduce the typography of the book.

If you're producing from scratch, for the love of God buy InDesign and use that. First of all, you get unimaginably better typography, but better yet, if you use well-threaded text and graphics frames you can export a tagged E-book automatically. This same tagging makes the book accessible to some, though admittedly not all, screen-reader users. (The tags are XML. You will have to clean them up a bit here and there in Acrobat 6, but that is a trivial job in most cases. If you're super-l33t you can program your own application to produce the XML.) Don't forget the Reduce File Size command in the File menu.

Interestingly, Safari and Firefox on OS X seem to be able to open a PDF right inside the browser with no trouble at all (and without any helper programs). Some people will want to read everything in their browsers. Thus I would also consider HTML. I'm not kidding. A nice valid HTML+CSS book can be read on a zillion devices-- everything from Microsoft Word to something really primitive like Microsoft Internet Explorer for Windows.
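
To make the HTML route concrete, here is a small Python sketch that wraps plain-text paragraphs in a minimal HTML page with an embedded stylesheet. The function name `text_to_html`, the doctype, and the CSS values are illustrative assumptions only; a real book would want proper headings, chapters, and navigation.

```python
from html import escape


def text_to_html(title: str, text: str) -> str:
    """Wrap blank-line-separated paragraphs in a minimal HTML page (hypothetical helper)."""
    # Paragraphs are assumed to be separated by a blank line.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Escape &, <, > and quotes so the book's text cannot break the markup.
    body = "\n".join(f"<p>{escape(' '.join(p.split()))}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        "<style>body { max-width: 35em; margin: 2em auto; line-height: 1.5; }</style>\n"
        "</head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(text_to_html("Sample Book", "First paragraph.\n\nSecond\nparagraph."))
```

Keeping the text in plain paragraphs and generating the markup from it means the same source can later be fed to other output formats.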

I simply wouldn't bother with proprietary E-book formats.
posted by joeclark at 3:58 PM on February 29, 2004
