Help me highlight Asian characters in MS Word?
How can I find and highlight non-Latin characters in a Microsoft Word document?

I have documents containing Japanese, Chinese, Korean and Vietnamese characters (hanzi, kanji, hanja, etc.).

I want to streamline the process of finding and highlighting those characters in a document.

I can find and replace text by all manner of font and formatting options, but I don't seem to be able to search for non-Latin characters.

Any ideas?

I currently use Word Starter but I'm happy to upgrade or to use a different program entirely if it'll give me highlighted documents that I can print out!
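Best answer: Word's regex implementation is incomplete, but it is possible to find non-Latin alphanumeric characters with the expression [!a-z A-Z 0-9] (tick "Use wildcards" under More in the Find dialog). As written, that expression matches any character that is not an unaccented Latin letter, a digit or a space, so you will have to add more values inside the brackets to ignore punctuation and special symbols, as well as paragraph marks, manual line breaks and the like.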
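
For anyone who wants to test the idea outside Word, here is a rough Python sketch of the same character class. It is not Word's wildcard syntax: `[^a-zA-Z0-9 ]` is just the closest Python equivalent of `[!a-z A-Z 0-9]`, and the second pattern, the `find_runs` helper and the sample text are my own assumptions for illustration. The second pattern takes the "ignore punctuation and line breaks" step to its simplest extreme by skipping everything in the ASCII range.

```python
import re

# Closest Python equivalent of Word's wildcard class [!a-z A-Z 0-9]:
# any single character that is not an ASCII letter, digit or space.
NOT_LATIN_ALNUM = re.compile(r"[^a-zA-Z0-9 ]")

# Assumed extension: skip the whole ASCII range (punctuation, tabs,
# line breaks) so that only runs of non-ASCII characters are matched.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7F]+")


def find_runs(text):
    """Return (start, end, run) for each run of non-ASCII characters."""
    return [(m.start(), m.end(), m.group()) for m in NON_ASCII_RUN.finditer(text)]


if __name__ == "__main__":
    # The comma is matched by the first pattern, just as in Word.
    print(NOT_LATIN_ALNUM.findall("ab 1, 東"))
    sample = "Tokyo (東京) is written とうきょう in kana.\n"
    for start, end, run in find_runs(sample):
        print(start, end, run)
```

Note that the second pattern also catches accented Latin letters (Vietnamese diacritics, for example) and typographic punctuation such as curly quotes, since those sit outside ASCII too.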
I'm not sure whether Word Starter has this functionality or not and haven't tried it with all these languages, but in Word 2003:

- open the Find dialog (Ctrl-F)
- click on the button that says More (bottom row, rightmost button)
- click on the button that says Format (bottom row, rightmost button)
- click on Language from the menu that pops up
- select a language, hit OK
- back in the original dialog box, click on the Replace tab (near the top, to the right of the Find tab)
- that tab has two text entry fields. The first one should be empty, and under it should be the name of the language you selected.
- click inside the second field, which should also be empty.
- click on the Format button again and select Highlight.
- click on the Replace All button

Note: this will use whatever highlight color you currently have selected. If you want different colors for different languages, click on the highlight color selector (in 2003 I think it's on the formatting toolbar by default) and pick a different color between each Replace All.

If the Replace All doesn't seem to work, make sure you actually have a highlight color selected.
Response by poster: Izzy has it with the regex tip. I had no idea that I could use regex in Word, and now that I can, I'm a very happy typesetter!

I've set up a few regex queries which I run as replace functions, highlighting in a different color each time, so I can clearly tell the non-Latin characters from the romaji.
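
The exact queries aren't listed here, but the "one pass per color" idea can be sketched in Python roughly as follows. Everything in it is an assumption for illustration: the Unicode ranges are simplified script buckets, the color names are placeholders rather than Word's highlight settings, and `plan_highlights` is a made-up helper that only reports which spans each pass would color.

```python
import re

# One (color, pattern) pair per pass, like running Replace All once per color.
# The ranges are deliberately simplified and are an assumption, not a full script list.
PASSES = [
    ("yellow", re.compile(r"[\u3400-\u4DBF\u4E00-\u9FFF]+")),  # Han ideographs
    ("green", re.compile(r"[\u3040-\u309F\u30A0-\u30FF]+")),   # hiragana and katakana
    ("pink", re.compile(r"[\uAC00-\uD7A3]+")),                 # Hangul syllables
    ("cyan", re.compile(r"[A-Za-z\u0100-\u017F]+")),           # Latin letters, incl. macron vowels in romaji
]


def plan_highlights(text):
    """Return sorted (start, end, color, run) tuples, one per matched run."""
    spans = []
    for color, pattern in PASSES:
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), color, m.group()))
    return sorted(spans)


if __name__ == "__main__":
    for span in plan_highlights("東京 Tōkyō 서울"):
        print(span)
```

A pattern like the last one cannot tell romaji from ordinary English words; it only separates Latin-script text from the other scripts, so it suits documents where the Latin text is known to be romaji.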


trig, I'd tried searching by language before, but Word (the Starter version at least) didn't find the non-Latin characters. It actually selected the whole document, which makes me think that it decides based on the language assigned to the document rather than on the individual characters. Thanks for the try though!
