Software for long documents
March 6, 2008 3:42 AM

What Windows software should I use to write a ~70k word public health thesis?

I've read through all previous posts on LyX and LaTeX and the dangers of Word with long documents and have looked into other available programs (eg OpenOffice), but I'm still having a hard time deciding what will be best for my situation.

Everyone in my department, including my supervisors, uses Word, and they like to review my work using "track changes".

I won't need to include any formulas or chemical structures, just a stack of tables, graphs, pictures and the odd Greek letter.

I'm also quite invested in my scrupulously maintained EndNote database. I have many custom fields, and PDFs linked for every entry.

I am hoping there is something out there that's stable, and straightforward, with powerful integration capabilities?
posted by bingoes to Computers & Internet (24 answers total) 10 users marked this as a favorite
Well, it's a tough choice. As you've discovered, Word's performance on large documents can get sketchy. Not the sort of stress you need when writing a thesis. The usual advice is to use LaTeX, but it's a serious commitment of time for a specialised skill. I know mathematicians use it, and I used it for my thesis and it was invaluable, but I still can't make a general recommendation. And you don't get track changes. And you wouldn't get to use Endnote, but instead the user-hostile BibTeX.

One workable solution I've seen: write each chapter as a separate Word document. Combine them only at the end. You won't want to pass around the whole thesis for proofing, anyway.
posted by outlier at 3:53 AM on March 6, 2008

Word sucks but it's what you're using. I've recently reconciled myself to using Word, mainly by slavishly using styles always, and this. Word's one redeeming feature is that it's a pretty nice environment for doing tables.

Use headings, captions and cross references religiously. Read this.

One document per chapter and knit together at the end.

Consider moving from Endnote to Zotero.

Never upgrade Word or EndNote until you're done with your thesis.

That will deal with 90% of the pain.
posted by singingfish at 4:09 AM on March 6, 2008 [2 favorites]

Otoh if I have the choice (i.e. working alone or with technologically literate colleagues) I'll always use LaTeX as a first choice.
posted by singingfish at 4:10 AM on March 6, 2008

I wrote my PhD thesis in separate chapters. Make sure you know what you're doing with a template before you start.
posted by biffa at 5:29 AM on March 6, 2008

Word can handle 70k+

We maintain heavy medical standards that are authored in Word, with tonnes of diagrams and tables. The average weight is 2-3 MB.

Amen to "Use headings, captions and cross references religiously." Follow the Document Outline.
posted by mattoxic at 5:44 AM on March 6, 2008

I'm in the same position as you, writing a public health thesis and feeling a little daunted by the process of learning LaTeX. Due to my feelings about Word having a hard time managing long documents, I've decided to take the plunge and just learn LaTeX. This has been a great help. You might also want to read this free WikiBook which I think I've memorized at this point.
posted by carabiner at 5:51 AM on March 6, 2008

Something else to consider when using Word is to "save as" using a different file name every so often (I'm paranoid and do it after every major revision of my papers). That way if your long document does get corrupted, you at least don't have to start from scratch.
posted by jmd82 at 6:09 AM on March 6, 2008

I learned LaTeX specifically to do my thesis. Previous stuff was always in Word. I would not want to do a multi-chapter document in Word.

Use JabRef to manage BibTeX files—it can import from EndNote, or fetch citations directly from PubMed.

TeX FAQ on tracking changes.

Don't use for this.
posted by grouse at 6:18 AM on March 6, 2008

You can make Word perform a lot better by turning on picture placeholders so that it's not constantly having to render images and graphs while you're moving around the document.

I'm doing my thesis in LaTeX, and I do love the output, math support, and the speed of editing plain text files but if I didn't have quite a lot of equations and a LaTeX-friendly supervisor I don't know that it would really have been worth the trouble.

Word's one redeeming feature is that it's a pretty nice environment for doing tables.

I was surprised to find what a pain tables in LaTeX can be, especially if they cover more than one page.
posted by tomcooke at 6:41 AM on March 6, 2008 [1 favorite]

Best answer: Word can handle 70k+

We maintain heavy medical standards that are authored in Word, with tonnes of diagrams and tables. The average weight is 2-3 MB.

Not 70KB, 70K words, or 275--300 pages.

The kicker is the readers/advisors. Over the whole time spent writing the thesis, your life would be easier, net, if you bit the bullet and did it in TeX... except for arguing with your advisors and waiting forever to get hardcopy back and so on. You might as well stick with Word, doing things like this:

(1) Save each chapter to separate files. You don't need to worry about using whizbang stuff to reintegrate it if you don't want to; the important thing is just that when Word fucks up Chapter 3, that doesn't spread to the others.
(2) Have a multi-tiered backup strategy. Save the current version, and back it up. But also have the last version and the last but one version backed up as separate files. That way, when you realize today that yesterday's work fucked up the files so now its shit's all retarded, you can just load the day-before-yesterday's version.
(3) When you save the current day's work, also save it as an RTF and/or plaintext.
(4) Keep all tables and figures as separate files, in addition to the embedded versions.
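Points (1) and (2) can be scripted. The following is a minimal Python sketch, not a tested backup tool: it copies one chapter file into a backup folder under a timestamped name and keeps only the newest few copies. The function name `snapshot`, the `keep` parameter and the folder layout are illustrative assumptions, not anything Word provides.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def snapshot(source: Path, backup_dir: Path, keep: int = 3,
             stamp: Optional[str] = None) -> Path:
    """Copy `source` into `backup_dir` under a timestamped name.

    Only the newest `keep` copies of that file are retained, so the
    current version, the last version and the last-but-one version
    are always available as separate files.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)
    if stamp is None:
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves the modification time
    # Timestamps in this format sort chronologically as plain strings.
    copies = sorted(backup_dir.glob(f"{source.stem}-*{source.suffix}"))
    for old in copies[:-keep]:
        old.unlink()
    return target
```

Calling `snapshot(Path("chapter3.doc"), Path("backups"))` at the end of each day would leave three dated copies of Chapter 3. Because each chapter is a separate file, a corrupted chapter only costs you that chapter's most recent copy.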
posted by ROU_Xenophobe at 7:35 AM on March 6, 2008 [4 favorites]

And you don't get track changes.

Track changes is an editor thing, not a typesetter thing. LaTeX is just the typesetter, and you can use any editor that it pleases you to use.

If you really wanted to, you could send people your source and they could open it in Word and track changes to their heart's content.

And you wouldn't get to use Endnote, but instead the user-hostile BibTeX.

Meh. At its worst, BibTeX is as bad as writing

As Blah (2004) says, \nocite{blah2004} wombats are tasty.

I was surprised to find what a pain tables in LaTeX can be

Easy way out: assemble them in Excel and then dump to LaTeX with the xl2latex macro (or some name similar to that).
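If the macro isn't to hand, the same idea is easy to sketch by hand. The Python below is an illustration only, not the macro mentioned above: it assumes the table has been saved from Excel as CSV, that the first row is the header, and that plain left-aligned columns are acceptable. The names `escape_latex` and `csv_to_tabular` are made up for this example.

```python
import csv
import io

# Characters LaTeX treats specially, mapped to their escaped forms.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape one cell, character by character, so nothing is escaped twice."""
    return "".join(_LATEX_ESCAPES.get(ch, ch) for ch in text)


def csv_to_tabular(csv_text: str) -> str:
    """Turn CSV text into a LaTeX tabular with a ruled header row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("no rows found")
    ncols = len(rows[0])
    lines = [r"\begin{tabular}{" + "l" * ncols + "}", r"\hline"]
    for index, row in enumerate(rows):
        if len(row) != ncols:
            raise ValueError(f"row {index + 1} has {len(row)} cells, expected {ncols}")
        lines.append(" & ".join(escape_latex(cell) for cell in row) + r" \\")
        if index == 0:
            lines.append(r"\hline")  # rule under the header row
    lines.append(r"\hline")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)
```

This does nothing for tables that run over a page; those would still need something like the longtable package.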
posted by ROU_Xenophobe at 7:40 AM on March 6, 2008

There's a great website out there with tips on using Word to write long documents -- e.g. using Styles to format your document template instead of weighing the document down with repetitive codes.
posted by rdn at 7:53 AM on March 6, 2008 [2 favorites]

I helped a friend of mine format her 400+ page dissertation a few years ago. It was chock full of figures, tables, and diagrams. If she had come to me for help at the beginning of her project, I probably would have told her to learn Adobe FrameMaker or LaTeX. Now, I'd probably give OpenOffice a good look. If you decide to go with Word, the following tips might help:

1. Create one master document with links to several sub-documents. I'd recommend one sub-document for each chapter. This will avoid having one massive doc that can get corrupted.

2. If possible, use a small set of standard styles for everything. Trying to figure out why there's a quarter-inch of extra margin after one particular paragraph is a pain when there are 250 different formatting styles to account for.

3. Back up your documents regularly. I wish I could say that my experience with master documents was trouble-free, but it wasn't. If you interrupt a save in progress (even an auto-save), you're rolling the dice. There are several programs available that will automatically back up files for you. FileHamster, Mozy, and JungleDisk come to mind.

4. Disable fast save if you're using a version of Word older than Word 2003, as it bloats the size of the file. (Fast save was removed in Word 2003.)

Good luck!
posted by braveterry at 8:13 AM on March 6, 2008 [1 favorite]

What every one else said.

Separate chapters; styles; a custom document template.
posted by notyou at 10:09 AM on March 6, 2008

I'm surprised to hear about MS Word problems with long documents. I work almost daily with 600+ page, 100k+ word documents without any hacks or any problems. On the other hand, we do keep backups.
posted by racingjs at 10:11 AM on March 6, 2008

Same here, racingjs. Every six months our office produces a series of 67 Word documents that run about 400+ pages each. Each has about 40-50 embedded images or graphs, and literally hundreds of tables. I've never had a problem, but we also generate 99% of the document through code. The 1% we do manually at the end, however, we've never had problems with.

If you do go with Word, I can't nth enough what tomcooke said about turning on picture placeholders.

Ultimately, I think this decision isn't really that big of a deal if you back up religiously. The horror stories you hear about Word munging big documents probably all have the common thread of involving an author who for whatever reason didn't keep backups. What I've done a couple of times when working on SUPER LIGHTNING IMPORTANT documents is set up a macro in Word so that when I invoke the macro (e.g., via Ctrl-P), it will (a) save to the hard drive; (b) save to a floppy; (c) save to a second hard drive; and possibly even (d) upload a timestamped version of the file to an external site.
posted by Doofus Magoo at 10:46 AM on March 6, 2008 [1 favorite]

If it's going to be a technical document with a lot of medical terms, save yourself a bunch of time and grab a copy of Stedman's Medical Dictionary. It works with most word processors, and will really help cut down on auto-correct errors and manual dictionary additions.
posted by WinnipegDragon at 11:35 AM on March 6, 2008

If you're going to do this, Word 2007 is definitely worth the upgrade, in all manner of ways.
posted by bonaldi at 12:32 PM on March 6, 2008

With respect to LaTeX, you have to tune out the evangelists and ask yourself if you REALLY, REALLY need to invest the considerable time it takes to learn how to use it (and use it well). If you're not a mathematics or physical sciences person, I'd always recommend against it.

I used Word for a 125,000 word dissertation, broken up into chapters at first, then merged. I had very few problems with it. Word is certainly capable of handling a document this large.
posted by yellowcandy at 2:34 PM on March 6, 2008

I'd use Adobe FrameMaker. Word might never corrupt your file(s), but opening/saving/working in them will become slower and slower over time. Plus, if you're not strict about the way you handle your styles and formatting, trying to fix them later can get really hairy. I use FrameMaker 7.1 to work with large volumes of text (6000-ish pages when printed, so probably over 1 million words). Even a huge file saves in the blink of an eye in Frame. Plus, you can spread your work over several chapters (FM files) and combine them in a book, and you can still search across the book (unlike trying to search across multiple Word files). Frame is also a lot stricter about the way it handles styles and formatting, so it's a lot harder to mess up than Word.
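For anyone who stays with one Word file per chapter, searching across them can be approximated from outside Word. The Python sketch below assumes the chapters are saved in the Word 2007 .docx format, which is a zip archive holding the body text in word/document.xml. It only looks at body paragraphs (not headers, footnotes or comments), and the names `docx_paragraphs` and `search_chapters` are illustrative.

```python
import zipfile
from pathlib import Path
from xml.etree import ElementTree

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Yield the plain text of each paragraph in the body of a .docx file."""
    with zipfile.ZipFile(path) as archive:
        root = ElementTree.fromstring(archive.read("word/document.xml"))
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the <w:t> pieces.
        yield "".join(node.text or "" for node in para.iter(W_NS + "t"))


def search_chapters(folder, phrase):
    """Return (file name, paragraph number, text) for each matching paragraph."""
    needle = phrase.lower()
    hits = []
    for path in sorted(Path(folder).glob("*.docx")):
        if path.name.startswith("~$"):
            continue  # skip the lock files Word creates for open documents
        for number, text in enumerate(docx_paragraphs(path), start=1):
            if needle in text.lower():
                hits.append((path.name, number, text))
    return hits
```

The search is case-insensitive and matches within a single paragraph only, so a phrase that spans a paragraph break will not be found.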
posted by korres at 2:36 PM on March 6, 2008

Response by poster: Thanks for all the answers, hivey. I feel like I have now reached resolution in terms of my approach: upgrade to Word 2007, save each chapter as a separate file, and save to a new file plus an RTF copy each day. But I'll also take a look at the other software suggestions so I'm ready to go in case Word and I have a falling out.
Thanks again.
posted by bingoes at 2:54 PM on March 6, 2008

Some important advantages of LaTeX, and how they don't seem to apply to you:

1. It forces you to do all the good things (like using cross-references properly) that singingfish recommends. But now that you've read singingfish's post, this doesn't apply to you.

2. It's pretty good at maths layout. This doesn't apply to you.

3. It has first-mover advantage, and now many journals require its use. This doesn't apply to you.

4. It's free. This may not apply to you.

5. It's cross-platform. This may not apply to you.

And LaTeX has many disadvantages, so I'm glad you chose Word.

I also want to follow singingfish in recommending that you consider Zotero.
posted by hAndrew at 6:39 PM on March 6, 2008

Writing a long and important one-off document requires an overall strategy of planning and precaution. First off, make sure you have incremental backups. I use the Mozy Free offsite backup service, because it works transparently. This way, if everything goes down the chute, you can probably pick up your data from a couple of hours ago. The free service gives you 'only' 2 GB of space, so you can keep all of the relevant files in one directory and have that as your backup target.

I heartily agree with the use of a stylesheet. Set up paragraph styles and templates from the beginning. You might actually be able to download an official thesis style and template from your university or department. It's the 21st century, man! Strap on your rocket belt and visit your university library or graduate offices. They may have advanced beyond a printed specification, and will direct you to their downloads page. It'll save you a lot of headaches. There's nothing like the heartbreak of finding that your margins are not mirrored and your running header doesn't have the correct information.

Be sure to use header and subhead styles absolutely consistently. The Word Outline View is really good for this. It will give you an overall picture of the logical structure of your document. In addition, those section headers will automatically create a Table of Contents whose page numbers are dynamically linked, so you don't have to manually adjust them every time you edit some text.

Remember to make use of object anchoring, keep together/keep follow, Split or noSplit tables, and text wrapping around objects. These will make embedded tables and graphics much easier to live with. When you have some kind of data table whose source is a spreadsheet file, consider using object linking and embedding to let the table update itself when you have to adjust the data halfway through writing. Otherwise, you'll have to constantly remember which tables need to be rewritten when your spreadsheet data changes. The same goes for charts and graphs. Even PowerPoint slides can be object linked.

The point of all these tools is to let you concentrate on your words, and leave the layout and formatting to the machine.
posted by DanYHKim at 5:05 AM on March 7, 2008 [1 favorite]

Tools to convert EndNote to BibTeX
posted by zouhair at 1:12 AM on March 9, 2008
