Change Margins on Project Gutenberg Books?
May 19, 2005 8:54 AM

Is there an easy way to change the margins on Project Gutenberg e-books?

I can print anything I want here at work for free. I would like to print Project Gutenberg e-books to read at lunch, but the margins are all set at about 75 characters. That is, when a line reaches about 75 characters, a line break is placed at the end of the line. Paragraphs are separated by two line breaks.

I would like text in each paragraph to wrap (not break at 75 characters) so that I can more efficiently print a novel on 8.5"x11" paper double sided, with .25" margins on the sides, in something like 10pt text.

As it is, I have very few options for more efficiently utilizing the space on each printed page. I can reduce the text's font size, but that would only make more white space on the page.

One workaround I found was printing a novel double-sided on legal (8.5"x14") paper in landscape format with three columns. I would prefer to not do this again.

Is there a macro in Word or a perl script or utility that can change this formatting? Is anyone else annoyed by Project Gutenberg's forced margins?

I do not know how to program. I am running Windows. I have access to a shell account on a Redhat Linux system. Yes, I am sure I want to print novels and not get them from the library.
posted by redteam to Computers & Internet (9 answers total)
 
Best answer: GutenMark is a tool that can convert Gutenberg etexts to nice-looking HTML or even LaTeX format with proper italics, quotation marks, etc. The HTML is probably sufficient, and you could import that into Word, adjust margins, etc.
posted by zsazsa at 9:15 AM on May 19, 2005


If you've got access to something UNIX-like, you could use sed. Something like (not tested, and I'm very rusty):

sed ':a;N;$!ba;s/\n/ /g' book.txt >formatted.txt

UNIX gurus please don't mock or belittle me for my humble offering. I was once wise like you.
posted by veedubya at 9:16 AM on May 19, 2005


Response by poster: A-derrrrrr. Ok, I saw that earlier when I was searching, but I thought it was only a LaTeX thing. I will see if it does what I want it to. Thanks.
posted by redteam at 9:17 AM on May 19, 2005


Response by poster: GutenMark is an amazing tool. It does pretty much everything I wanted and way more. I don't know how I didn't see it before.

Thanks for your line of sed, veedubya! It looks like a plain and simple solution. I'm going to give that one a try when I get home (I can only access my shell account from there).
posted by redteam at 9:45 AM on May 19, 2005


Blackmask has many of the Gutenberg books, converted into other formats: PDF, plain text, Mobipocket, etc.
posted by vacapinta at 9:53 AM on May 19, 2005


It looks to me like that sed line will replace all line breaks with spaces. Probably not exactly what you want. I imagine there's a regular expression out there that will be able to handle it though.
posted by neckro23 at 10:08 AM on May 19, 2005


The usual UNIX way to do this is 'fmt'. -w specifies the width. It's pretty simplistic, but would probably be adequate for this.
posted by sfenders at 11:39 AM on May 19, 2005
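
For anyone without 'fmt' to hand (on Windows, say), here is a rough Python sketch of the same idea: re-wrap each paragraph to a chosen width. It is not a reimplementation of fmt, just an approximation using the standard textwrap module. The function name reflow and the default width of 100 are made up for illustration, and it assumes paragraphs are separated by blank lines, as described in the question.

```python
import re
import textwrap


def reflow(text: str, width: int = 100) -> str:
    """Re-wrap each blank-line-separated paragraph to at most `width` columns.

    `reflow` and the default width are illustrative choices, not part of fmt.
    """
    # Split on blank lines (lines that are empty or contain only whitespace).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # textwrap.fill turns the old line breaks into spaces, then wraps
    # the paragraph again at the requested width.
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)
```

Picking a width that matches the printed page (for 10pt text with narrow margins, something well above 75) should fill the page better than the original line length does.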


Oh, and using sed to do approximately the right thing is slightly more complicated. To remove all the newlines, except for those followed by a blank line:

sed -r ':start
N
s/\n[ \t]*$/\n/
t yes
:yes
s/\n([^\n])/ \1/
t start
p
d'
posted by sfenders at 12:24 PM on May 19, 2005


Here's what I do for the smallest number of pages on the printout:

On the Gutenberg site, highlight the entire text (Ctrl A) and copy it to the Clipboard (Ctrl C).

In Word, paste with Paste Special | Text. [This puts in the text with hard returns.]

Delete the Gutenberg boilerplate beginning and ending.

Set the left and top margins to 0.5 and the right and bottom margins to 0.25.

Insert page numbering (I like it at the upper or lower right).

Replace all double returns (^p^p) with an arbitrary symbol that doesn't appear in the text (I use ``). [This gets rid of the extra spacing between paragraphs and keeps the paragraphs from running together in the next step.]

Replace all single returns with a space. [This gets rid of the hard returns.]

Replace all the arbitrary symbols (``) with a single return (^p). [This restores the paragraph breaks.]

Highlight the entire document and set the Format | Paragraph | Special for an initial 0.25 indent [or create and apply a style to do this].

Unindent the chapter titles, or apply the Normal style.

Format | Columns to double column.

Print, find a heavy-duty stapler and staple at the upper left corner with the staple running vertical (which makes the pages easier to turn).
posted by KRS at 8:04 AM on May 20, 2005 [2 favorites]
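
The three find-and-replace steps above (protect the double returns, turn single returns into spaces, restore the paragraph breaks) can also be sketched outside Word. This is only a minimal Python illustration of the same placeholder trick, not KRS's method. The name unwrap_paragraphs, the paragraph_break parameter, and the NUL placeholder are assumptions made for the example; the placeholder stands in for the arbitrary symbol and is assumed not to appear in the book.

```python
import re

# Hypothetical stand-in for the "arbitrary symbol": assumed absent from the text.
PLACEHOLDER = "\x00"


def unwrap_paragraphs(text: str, paragraph_break: str = "\n") -> str:
    """Join hard-wrapped lines inside each paragraph, keeping paragraph breaks."""
    # Normalize Windows line endings and drop leading/trailing blank space.
    text = text.replace("\r\n", "\n").strip()
    # Step 1: replace each run of blank lines with the placeholder.
    text = re.sub(r"\n[ \t]*\n\s*", PLACEHOLDER, text)
    # Step 2: every remaining line break is a hard return; make it a space.
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)
    # Step 3: turn the placeholders back into paragraph breaks.
    return text.replace(PLACEHOLDER, paragraph_break)


if __name__ == "__main__":
    sample = (
        "It was the best of times,\n"
        "it was the worst of times.\n"
        "\n"
        "It was the age of wisdom,\n"
        "it was the age of foolishness.\n"
    )
    print(unwrap_paragraphs(sample))
```

The default restores a single return per paragraph, matching the steps above; passing paragraph_break="\n\n" keeps a blank line between paragraphs instead.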

