Word master document creation and maintenance
Creating and maintaining a master document in Word with sections from other documents.

I have a directory full of Word documents that are all similarly formatted with Word heading outlines -- each document has sections called "table of contents," "summary," "requirements," etc. They're real outlines, so if I go into, say, outline or document map viewing mode, they display correctly as collapsible trees. The content under the headings is in paragraphs and bullet lists, and it'll be continually updated for the foreseeable future.

I want to create a single document that contains all of the other documents' "summary" sections. For each document, print its name, then the contents of its summary section. I would like to update that master document with the other documents' contents on a regular basis, and I don't want to do it by manually copying and pasting. I also don't want to make a real "master document" with "subdocuments," because I don't want to manually handle each document -- we only have about 20 now, but in the future ...

Initially, I thought I'd write a Python script using the Word object model, but the Word object model doesn't seem to support selecting text by its heading. I wouldn't mind programmatically converting the documents to RTF or XML as an intermediary step, and XML looks like it might be a candidate -- since each document has a table of contents, each heading section has an automatic bookmark, but the bookmarks are named with random numbers. I could look at the table of contents, pull out that random ID for the heading I want, then go to the corresponding bookmark and select all text until the next bookmark. Unfortunately, Word-generated XML is so full of formatting crap that it seems exceptionally difficult to get the final text formatted the same as the original, bullet points and all.

Anyway, if I had a script that went through all the documents, pulled out the text I wanted by heading, then wrote it to a new master doc every time I ran it, I'd be happy.

Any ideas? I've searched MSDN and Google Groups and our own AskMeFi archives for the last couple of weeks to no avail, but I can't imagine that nobody's had this problem before.
Try opening these things in OpenOffice, saving them in .odt format, then having a poke around inside the .odt files to see if the XML inside them is more usable (an .odt file is in fact just a renamed .zip archive with a well-defined substructure). I have no idea whether this will work for you, but at least it's an idea.
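
For anyone who wants to try the poking around, here is a rough Python sketch of that idea, not a tested solution. It assumes the documents have already been saved as .odt, that the body lives in content.xml inside the archive, and that the section is introduced by a real heading element whose text matches the name you ask for. The function name section_text is made up for illustration, and it returns plain text only, so fonts and bullet styles are lost (list items just get a "* " prefix).

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

HEADING = f"{{{TEXT_NS}}}h"
PARAGRAPH = f"{{{TEXT_NS}}}p"
LIST = f"{{{TEXT_NS}}}list"
LEVEL = f"{{{TEXT_NS}}}outline-level"


def section_text(odt_file, heading):
    """Return the lines under `heading`, stopping at the next heading
    of the same or a higher outline level (hypothetical helper)."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    body = root.find(f"{{{OFFICE_NS}}}body/{{{OFFICE_NS}}}text")
    if body is None:
        return []

    lines = []
    level = None  # outline level of the wanted heading once found
    for element in body:
        is_heading = element.tag == HEADING
        if level is None:
            # Still searching for the wanted heading (case-insensitive).
            title = "".join(element.itertext()).strip()
            if is_heading and title.lower() == heading.lower():
                level = int(element.get(LEVEL, "1"))
            continue
        if is_heading and int(element.get(LEVEL, "1")) <= level:
            break  # reached the next section
        if element.tag == LIST:
            # Flatten bullet lists, one line per list paragraph.
            for item in element.iter(PARAGRAPH):
                lines.append("* " + "".join(item.itertext()).strip())
        else:
            text = "".join(element.itertext()).strip()
            if text:
                lines.append(text)
    return lines
```

Calling section_text("some-document.odt", "summary") would give you a list of strings for that one file. Whether the converted files keep their headings as real heading elements is something you'd have to check on your own documents.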
posted by flabdablet at 12:59 PM on January 18, 2007

This is what VBA is for.

If you have defined sections, you just need your master doc to either hold a list of the subdocs or loop through the directory of them, and copy/paste the chunk you want.
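
The looping half of that can be sketched in Python, since that's what the question mentioned, rather than VBA. This is only an outline under assumptions: build_master and extract_summary are hypothetical names, and extract_summary stands in for whatever actually pulls the summary out of one document (Word automation, or anything else). It builds plain text, so keeping the original bullet formatting would still mean doing the copy inside Word.

```python
from pathlib import Path


def build_master(directory, extract_summary, pattern="*.doc"):
    """Return one text with each document's name followed by its summary.

    `extract_summary` is a caller-supplied function (an assumption here)
    that takes a path and returns the summary as a list of lines.
    """
    parts = []
    # Sorted so the master document comes out in a stable order each run.
    for path in sorted(Path(directory).glob(pattern)):
        parts.append(path.name)               # the document's name
        parts.extend(extract_summary(path))   # then its summary lines
        parts.append("")                      # blank line between documents
    return "\n".join(parts)
```

Because it globs the directory on every run, new documents get picked up without anyone maintaining a list by hand.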
posted by pompomtom at 5:03 PM on January 18, 2007

