How do I create a book index?
January 22, 2004 3:23 PM

How do they create book indexes? I have a friend who's a grad student, and she needs to create an index for her hundred page scientific thesis. Is there any easy way to do it in MS Word, or another software product -- something that a non-geek could churn and burn with in a couple of hours?
posted by SpecialK to Writing & Language (17 answers total)
Not sure about academia but mainstream publishers pay specialists reasonably large dollars to do this. You could certainly generate a word list, linked back to the pages of appearance, with simple Perl/AWK scripting or similar (probably could find dedicated utilities on the web) but the art is in knowing what should and shouldn't be included. This is likely to take at least a few days for 100 pages.
posted by billsaysthis at 3:31 PM on January 22, 2004
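
The "word list linked back to the pages of appearance" mentioned above can be sketched in a few lines. This is only a minimal illustration in Python rather than Perl/AWK, assuming the thesis text is already split into one string per page; `build_concordance`, `sample_pages`, and the stopword list are hypothetical names and data, not part of any existing utility.

```python
import re
from collections import defaultdict

def build_concordance(pages, stopwords=frozenset()):
    """Map each lowercase word to the sorted list of pages it appears on.

    `pages` is a list of page texts; page numbers start at 1.
    """
    found = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        # A word is a letter followed by letters, apostrophes, or hyphens.
        for word in re.findall(r"[A-Za-z][A-Za-z'-]*", text):
            word = word.lower()
            if word not in stopwords:
                found[word].add(page_number)
    # Return the words in alphabetical order, each with sorted page numbers.
    return {word: sorted(nums) for word, nums in sorted(found.items())}

# Hypothetical sample text standing in for the thesis pages.
sample_pages = [
    "The enzyme binds the substrate.",
    "Binding of the substrate changes the enzyme.",
]
concordance = build_concordance(sample_pages, stopwords={"the", "of"})
for word, nums in concordance.items():
    print(f"{word}: {', '.join(map(str, nums))}")
```

Note that "binds" and "binding" come out as separate entries. A script like this produces the raw list, and deciding what to merge, keep, or drop is still the human part of the job.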

What you do in Word (or FrameMaker, which is what I usually use for indexing a book) when making an index is mark each topic (not individual words -- an index is not simply a big dumb listing of everywhere a word is found; that's a concordance) under the different ways someone might look it up. For example, in a user manual, I might mark a section on a "network preferences pane" with "network preferences", "preferences: network" (in FrameMaker the colon indicates a subtopic; there would be a "preferences" topic and then a list of the various sorts of preferences under it), and "settings: see preferences". If networking were a big topic in the book I might also mark it for "network: preferences" and maybe also "Internet: see also network." Once you have gone through a document and done this with every salient topic, the software can then whip through the document and generate the index.

Ideally you would be doing this as you wrote the document, but nobody ever does.
posted by kindall at 3:42 PM on January 22, 2004
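
To make the marking idea concrete, here is a rough Python sketch of what the "generate the index" step does with markers like the ones above. It is not how Word or FrameMaker actually implement it; `build_index`, the page numbers, and the "preferences: display" marker are made up for illustration, and only the "topic: subtopic" and "see" conventions come from the post.

```python
from collections import defaultdict

def build_index(markers):
    """Turn (marker_text, page) pairs into sorted, formatted index lines.

    A marker is "topic" or "topic: subtopic". A subtopic starting with
    "see " is treated as a cross-reference and gets no page number.
    """
    topics = defaultdict(
        lambda: {"pages": set(), "subs": defaultdict(set), "refs": set()}
    )
    for text, page in markers:
        topic, _, sub = (part.strip() for part in text.partition(":"))
        entry = topics[topic]
        if not sub:
            entry["pages"].add(page)
        elif sub.startswith("see "):
            entry["refs"].add(sub)
        else:
            entry["subs"][sub].add(page)

    lines = []
    for topic in sorted(topics, key=str.lower):
        entry = topics[topic]
        # Page numbers first, then any cross-references.
        parts = [str(p) for p in sorted(entry["pages"])] + sorted(entry["refs"])
        lines.append(topic + (", " + ", ".join(parts) if parts else ""))
        for sub in sorted(entry["subs"], key=str.lower):
            pages = ", ".join(str(p) for p in sorted(entry["subs"][sub]))
            lines.append(f"    {sub}, {pages}")
    return lines

# Hypothetical markers and page numbers for a user manual.
markers = [
    ("network preferences", 12),
    ("preferences: network", 12),
    ("settings: see preferences", 12),
    ("preferences: display", 30),
    ("network preferences", 31),
]
print("\n".join(build_index(markers)))
```

The point of the sketch is that the same passage is entered several times under different headings, and the software only does the sorting and merging.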

MS Word does allow you to create indexes easily (that's a relative term). You just go through and mark all the words you want to include in your index, and then it will go through and create an index of them.

So, it does all the detail-oriented parts but you still have to decide what words to include.
posted by vacapinta at 3:46 PM on January 22, 2004

The American Society of Indexers has a list of available software. Quite frankly, she might as well stick with MS Word. If it's just a proper name index, execute "find" for each name in the bibliography and then alphabetize. A subject index will take considerably more thought. ON PREVIEW: as Kindall demonstrates.
posted by thomas j wise at 3:46 PM on January 22, 2004

I work in museum publishing and billsaysthis is right as far as what we do -- we hire freelance specialists, precisely because depending on the complexity of the index, it can be a pretty time-consuming, labor-intensive process. I would recommend taking a look at the Chicago Manual of Style, which has a chapter dedicated to indexing. As for software programs, the ones generally used by professional indexers often require more learning time than your friend probably has.

The 14th ed. of the Chicago Manual of Style (just recently out of print but easily available in libraries and used bookstores) also explains how to index the old-fashioned way -- with a pile of 3X5 index cards.
posted by scody at 3:55 PM on January 22, 2004

I edited a 250 page book recently and my co-editor, who is a cataloging librarian, did the index for us. It took her the better part of two weeks, possibly longer, to do. It is a mighty fine index, though. You can do an index yourself, but unless you know what you're doing [in terms of knowing what to index, how to index it, how to deal with authority control issues, alternate spellings etc] it will be more hassle than it's worth; a bad index to a book is almost worse than no index at all. If you really need to do it [or your friend does], I'd go with the ASI software list that tjw pointed out, or see if she could pay a library student a few hundred bucks to do it. Here's a primer to using Word to make an index. If it just has to make it past a review committee, this might be okay.
posted by jessamyn at 4:31 PM on January 22, 2004

I work for a legal publisher as an editor and although we farm out huge indexes, for small projects (i.e., less than 400 pages, say) we just do them ourselves. A hundred pages could be indexed in a simple Word file in just a few hours, whereas if your friend tries to learn a program for just this one project it may take longer than that.

I recommend that she go through her paper with a highlighter first and mark all the things she wants to include, and then open up a Word file and go to it, typing in words and phrases with page numbers and putting them in alphabetical order as she goes along.

And don't do it with index cards! She'll waste so much time shuffling through them for the one she wants and writing it all out longhand (most people type faster than they can write) and then have to type it all up on the computer. There's a reason why no one does it that way anymore - because a computer file makes it possible to insert text wherever you want on a list.
posted by orange swan at 4:31 PM on January 22, 2004
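
If the typed-up list of terms and page numbers ends up somewhere a script can read it, the alphabetizing and page-range tidying can be automated. The following is a small Python sketch under that assumption; `format_pages`, `format_index`, and the sample terms are hypothetical and only illustrate the bookkeeping, not any particular tool.

```python
def format_pages(pages):
    """Collapse page numbers into ranges, e.g. [3, 4, 5, 9] -> "3-5, 9"."""
    ranges = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            ranges[-1][1] = page          # extend the current run of pages
        else:
            ranges.append([page, page])   # start a new run
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)

def format_index(entries):
    """Alphabetize {term: [pages]} ignoring case, one formatted line per term."""
    ordered = sorted(entries.items(), key=lambda item: item[0].lower())
    return [f"{term}, {format_pages(pages)}" for term, pages in ordered]

# Hypothetical terms and pages, as typed in from a highlighted printout.
entries = {
    "substrate": [14, 3, 4, 5],
    "enzyme": [2, 3],
    "Catalysis": [9],
}
print("\n".join(format_index(entries)))
```

Entries can then be typed in any order as they are found, and the sorting happens once at the end.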

Step One is to find out from the school what the minimum acceptable standard for the index is (I assume it's a requirement), and do not one iota more than that.

If your friend is writing a 100 page scientific thesis, she might be using LaTeX, and there are indexing packages for LaTeX. It's just a matter of marking the words to be indexed.

If she's not using LaTeX, she probably should be. A full LaTeX/BibTeX/PostScript/etc setup will translate on the fly between different citation systems, will assemble your references section for you, will generate much better-looking and better-behaved math than Word will, will typeset much more nicely than Word and vastly more consistently (i.e., no more having your pagination change because you switched printers), will directly generate PDFs, and all for free. Many journals also have style or class files that generate output correctly formatted and cited for that journal, and many schools have style or class files that will generate correctly formatted and cited theses.

"I recommend that she go through her paper with a highlighter first and mark all the things she wants to include, and then open up a Word file and go to it, typing in words and phrases with page numbers and putting them in alphabetical order as she goes along."

She can't do that until she's absolutely done done done with no need for revision, and even then there's a nontrivial chance that if she prints her thesis on a different printer than it was set for while writing, or on a different machine, the pagination will change, rendering all her manual work inaccurate.
posted by ROU_Xenophobe at 5:25 PM on January 22, 2004

It may have been in a recent post here that someone mentioned the Twin Oaks community in Virginia, and one thing I noticed on the site was that they did book indexing. Perhaps they have good rates?
posted by milovoo at 6:17 PM on January 22, 2004

Oh, and more importantly, they use Cindex. It runs on Windows or Mac, and a student copy is $80.
Free Demo copy available - ("The demonstration copy is full-featured except that it can accommodate only 100 index entries, and lacks spell-checking.")
posted by milovoo at 6:46 PM on January 22, 2004

I'm a legal editor as well. I suggest that if you do not have the software, the old highlighter will work just fine. I am assuming an index for a 100 page thesis wouldn't be that complex.
posted by jasonspaceman at 7:53 PM on January 22, 2004

Response by poster: Thanks! The guide to how to index with Word will do just poyfect ... it needs the index because of the number of science terms that are in there. It's not a problem to put time in on the index, but it is a problem not to have one in the final copy...

All together now: "Thanks, AxMe!"
posted by SpecialK at 10:08 PM on January 22, 2004

I agree with ROU; LaTeX is the way to go. I would recommend starting with the Not So Short Guide to LaTeX2e (aka "LaTeX2e in 131 minutes"). Then, my favorite LaTeX environment for the Mac is TeXShop. Finally, there's the Comprehensive TeX Archive Network.

For a Windows LaTeX distribution, look at MiKTeX. I haven't found anything for Windows that's as nice as TeXShop on the Mac, though.
posted by mrbill at 11:58 PM on January 22, 2004

"I haven't found anything for Windows that's as nice as TeXShop on the Mac, though."

Did you check TeXnicCenter? Available via
posted by swordfishtrombones at 1:40 AM on January 23, 2004

"For a Windows LaTeX distribution, look at MiKTeX. I haven't found anything for Windows that's as nice as TeXShop on the Mac, though."

The shareware WinEdt bolts on top of MiKTeX nicely, with pushbuttons for almost everything you'd ever want to do (tex, latex, dvips, gv, pdflatex, etc) and menu-insertion of most LaTeX code and math symbols. It has interfaces for R and some other stuff too.

Or you could just use a win32 version of emacs.
posted by ROU_Xenophobe at 6:47 AM on January 23, 2004

In case anyone ever needs to index a QuarkXPress or InDesign document, I've had good results with Sonar Bookends.
posted by Dean King at 8:31 AM on January 23, 2004

Myself, I would recommend Corel Ventura and/or Adobe FrameMaker. They are about equal in features, with Ventura having a slight edge on usability.

Quark and InDesign are not suitable for long-document publishing. InDesign simply does not have the required feature set, and Quark needs about ten grand of add-on modules to equal Ventura/FrameMaker.

Come to think of it, WordPerfect is probably an even better solution for a thesis work, with OpenOffice coming a very, very close second. WP has proven long-document capability, though without the easy layout features of FM/Ventura; OpenOffice should be its equal at this point. Both will likely require less learning than Ventura/FM.

Please, please, please: strongly recommend to your friend that she avoid MSWord at all costs. It IS NOT stable when dealing with long documents. She faces a very significant risk of losing everything.
posted by five fresh fish at 10:24 AM on January 23, 2004
