

How can I find all the acronyms in a large Word document?
April 9, 2009 9:20 AM

Is there a Word plugin or standalone program that will search a Word doc and list all occurrences of a given string, preferably with surrounding words for context?

I work with Federal contract proposals. Very long, arcane documents packed with acronyms. Each acronym is spelled out the first time it appears in a top-level section. Then it can stand alone through the rest of that section, but has to be spelled out again on first appearance in a new top-level section. That sounds simple, but sections are usually written by different people, or pulled from the boilerplate library and modified, so what we end up with is acronyms being spelled out randomly, way more often than they need to be. Tracing each acronym (usually hundreds) through a document that may well run 75 pages and making sure each one follows that style is a hugely laborious process that comes right on top of deadline. It's not so much that I object to the tedium as that we usually just don't get the time to do this properly. So I'm looking for a tool to help automate it.

Yes, there's Word's find command, but that's proving terribly awkward in practice. I figured there had to be a plugin that would handle this, but the only acronym-related software goodies I've found for Word want to look the acronym up and tell you what it means. I know what it means. I just need to find all instances of it and figure out which ones should be spelled out and which ones actually are.

Next I considered advanced search toys, but things like Google Desktop want to go through all your stuff and identify multiple files. That's not the problem. Only the current file matters.

What I think I need is something that will search the particular file I'm in for a term and give me something like what Google gives you for the whole net. A list of hits for that term in that file, with maybe ten words on either side for context, and a page number. If it could tell me what numbered section each one is in, that would be ideal. But that's probably pushing it.

Anyone got any ideas for something that will do what I'm asking? Or a better question to ask in order to get the ultimate result?
posted by Naberius to Computers & Internet (4 answers total)
 
I am not sure if this would work with all of the formatting characters that are in a Word document, but Notepad++ is a great program for searching files for a string. You can narrow your search to a certain directory, as well as a certain file extension.

It will then list the line number, the file name and then the whole line where that string is found. If you want to limit your search to just one file, you can put that file name in as the restriction and it should search just that file for you.

It's worth considering if nothing else you find will do.
posted by Brettus at 10:23 AM on April 9, 2009


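Use a macro. This one finds the next occurrence of the search text and selects the entire sentence containing it. Modify it as needed.
Sub Macro1()
    '
    ' Macro1 Macro
    ' Macro recorded 4/9/2009 by Russell Hyland
    '
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "looking"             ' the string to search for
        .Forward = True
        .Wrap = wdFindContinue        ' wrap around at the end of the document
        .Format = False
        .MatchCase = False            ' set to True to match acronyms by case
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    ' Expand to the whole sentence only if the text was actually found
    If Selection.Find.Execute Then
        Selection.Expand Unit:=wdSentence
    End If
End Sub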
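
One possible modification, as a rough sketch rather than a tested solution: instead of selecting a single sentence, loop over every hit and write a Google-style list (page number plus roughly ten words either side) into a new document. The names ListHitsWithContext and FlattenText, the search term "GSA", and the ten-word window are all made up for illustration. Word counts punctuation marks as "words", so the context window is approximate, and hits inside tables or footnotes may need extra handling.

```vba
' Hypothetical helper: collapse paragraph marks and tabs so each hit fits on one line.
Function FlattenText(ByVal s As String) As String
    FlattenText = Replace(Replace(s, vbCr, " "), vbTab, " ")
End Function

Sub ListHitsWithContext()
    Const SEARCH_TERM As String = "GSA"   ' assumed example acronym; change as needed
    Const CONTEXT_WORDS As Long = 10      ' assumed context size on each side
    Dim src As Document, report As Document
    Dim hit As Range, ctx As Range

    Set src = ActiveDocument              ' capture before Documents.Add changes the active document
    Set report = Documents.Add            ' hits are listed in a new, blank document
    Set hit = src.Content

    With hit.Find
        .ClearFormatting
        .Text = SEARCH_TERM
        .Forward = True
        .Wrap = wdFindStop                ' stop at the end instead of wrapping around
        .Format = False
        .MatchCase = True                 ' acronyms are case-sensitive
        .MatchWholeWord = True
        .MatchWildcards = False
    End With

    Do While hit.Find.Execute
        ' Copy the hit and widen the copy by CONTEXT_WORDS words in both directions
        Set ctx = hit.Duplicate
        ctx.MoveStart Unit:=wdWord, Count:=-CONTEXT_WORDS
        ctx.MoveEnd Unit:=wdWord, Count:=CONTEXT_WORDS
        report.Content.InsertAfter "p. " & _
            hit.Information(wdActiveEndAdjustedPageNumber) & ": " & _
            FlattenText(ctx.Text) & vbCr
        ' Continue searching from just past this hit
        hit.Collapse Direction:=wdCollapseEnd
    Loop
End Sub
```

Run it with the proposal open and active; the new document gets one line per occurrence, so you can scan for which ones are spelled out. It reports page numbers only, not numbered section headings.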
posted by RussHy at 11:03 AM on April 9, 2009


The "Search" (which is different from "Find") feature in Adobe Reader works pretty much exactly as you describe. Try saving your Word document as a PDF and opening it in Reader.
posted by oulipian at 1:10 PM on April 9, 2009


I'm a big fan of textual analysis & did a lot of research into this a while back.

I would try these three programs in this order.

TextStat (freeware, reads Word docs)

Textanz (shareware)

Concordance (shareware)
posted by MesoFilter at 5:06 PM on April 9, 2009

