e pluribus paginis unus
November 18, 2011 9:51 AM

How can I convert a web site, which is organized as a table of contents linking to 200 or so individual html "chapters", into one ebook?

I would like to read this book, linked this morning on the blue, on my ereader in epub format. I know Calibre can convert html to any ebook format, but I'm having trouble figuring out the best method of converting 200 html files into an easily navigable and readable format for an ereader.

I'd like to avoid two hours of copying, pasting, and cleaning up. I guess I could laboriously add every single page to Instapaper and then download the result as "unread." But I thought there might be a better way. Might not.

Li'l help here?
posted by General Tonic to Computers & Internet (10 answers total) 1 user marked this as a favorite
If you've got some programming/scripting chops it wouldn't be terribly difficult to mix a little wget and cat and get where you're trying to go.

The Instapaper idea is pretty good, though. If you're using Chrome, you could install the Instachrome extension, then right-click each link and send it to Instapaper.
posted by ndfine at 10:36 AM on November 18, 2011

Link to Instachrome
posted by ndfine at 10:37 AM on November 18, 2011

What kind of system are you on? Windows, Mac, *nix?

On Linux, and probably Mac (and maybe Windows?), you can use wget or something similar to mirror the whole site locally.

Then it should be relatively easy to write a script to strip out all the header info and cat all the files into one big one. From there Calibre will do the rest of the job for you.
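
A minimal sketch of that "strip the headers and cat" step, in Python rather than shell. It assumes the site has already been mirrored into a local folder, that the table of contents page links to each chapter with a relative .htm/.html href, and that the pages can be read as UTF-8 (undecodable bytes are replaced). The function names and the book.html output name are made up for illustration.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def chapter_order(toc_html):
    """Return local .htm/.html links from the contents page, in order, without repeats."""
    parser = LinkCollector()
    parser.feed(toc_html)
    seen, ordered = set(), []
    for href in parser.hrefs:
        name = href.split("#")[0]  # drop any in-page anchor
        is_page = name.lower().endswith((".htm", ".html"))
        if is_page and "://" not in name and name not in seen:
            seen.add(name)
            ordered.append(name)
    return ordered


def body_of(html):
    """Return what is inside <body>, or the whole text if no body tag is found."""
    match = BODY_RE.search(html)
    return match.group(1) if match else html


def merge_site(mirror_dir, toc_name, out_name="book.html"):
    """Concatenate the chapter bodies into one HTML file for Calibre to convert."""
    root = Path(mirror_dir)
    toc = (root / toc_name).read_text(encoding="utf-8", errors="replace")
    parts = ["<html><head><meta charset='utf-8'><title>Merged book</title></head><body>"]
    for name in chapter_order(toc):
        page = root / name
        if page.is_file():  # skip links the mirror did not fetch
            parts.append(body_of(page.read_text(encoding="utf-8", errors="replace")))
    parts.append("</body></html>")
    out_path = root / out_name
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return out_path
```

Calling merge_site("mirror", "Xeno.htm") would write mirror/book.html, which can then be added to Calibre like any other HTML file. The folder name here is only an example, and real pages may need extra cleanup (navigation bars, footers) that this sketch does not attempt.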
posted by Mister_Sleight_of_Hand at 10:40 AM on November 18, 2011


Or, you know, exactly what ndfine said. *sigh*
posted by Mister_Sleight_of_Hand at 10:41 AM on November 18, 2011

Calibre includes a tool called web2disk that does exactly what you need.
posted by clearlydemon at 10:47 AM on November 18, 2011 [1 favorite]

I used Acrobat to index the site and converted it to an epub file in Calibre. Odd little book.
posted by michaelh at 11:32 AM on November 18, 2011

Yeah, seems like Instapaper is the way to go. I was hoping to do this with around a dozen mouse-clicks, but it looks like I'm not getting away with anything less than about 400 clicks. Oh well.
posted by General Tonic at 2:13 PM on November 18, 2011

Acrobat Pro can let you send a whole website to pdf.

Then send the PDF through Calibre.
posted by wenat at 8:02 PM on November 18, 2011

clearlydemon: How do I implement web2disk?
posted by General Tonic at 8:25 PM on November 18, 2011

web2disk is a command line tool that is installed with Calibre.
If you are using OS X, you must follow these instructions before using it.

Then, in a Terminal, Console or Command Prompt window (depending on your OS), type:

web2disk http://www.xenology.info/Xeno.htm

You might want to play with the options that set the directory the website is downloaded into and the depth of links that are followed.
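
For instance, here is a small Python sketch that builds such a command and runs it only if web2disk is on the PATH. The --base-dir and --max-recursions option names are taken from Calibre's web2disk documentation as I remember it, so check them against web2disk --help on your install. The helper name, the "xenology" folder, and the depth of 2 are illustrative guesses, not recommended settings.

```python
import shutil
import subprocess


def web2disk_command(url, base_dir=".", max_depth=1):
    """Build the argument list for Calibre's web2disk tool (option names assumed, see above)."""
    return [
        "web2disk",
        "--base-dir", str(base_dir),         # folder the downloaded pages are saved into
        "--max-recursions", str(max_depth),  # how many levels of links to follow
        url,
    ]


if __name__ == "__main__":
    cmd = web2disk_command("http://www.xenology.info/Xeno.htm",
                           base_dir="xenology", max_depth=2)
    if shutil.which("web2disk") is None:
        print("web2disk not found on PATH; install Calibre's command line tools first")
    else:
        subprocess.run(cmd, check=True)
```

A depth of 1 should be enough if every chapter is linked directly from the contents page. Raising it pulls in pages linked from the chapters as well, which can grow the download quickly.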
posted by clearlydemon at 11:43 AM on November 19, 2011

This thread is closed to new comments.