CJK Word translation macro
March 30, 2008 8:43 PM
I need a Word macro that will go through a folder, run word/character counts on each file in the folder, and deliver a total count of Chinese/Japanese/Korean characters, ignoring all English. This seems like it shouldn't be hard to script, but insufficient leetness on my part (and the fact that I've never written a macro before) means I'm more or less clueless here. What am I missing?
Response by poster: Yeah - the few times I've used the built-in CJK count in Word, it seems to be counting full-width Chinese punctuation (in GB2312, at least) as a CJK character.
posted by bokane at 9:33 PM on March 30, 2008
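One way around the punctuation problem described above is to count by Unicode block rather than trusting the built-in statistic. The following is only a sketch, not anything posted in this thread: the function name CountCJK and the choice of blocks are assumptions, and it deliberately skips the CJK Symbols and Punctuation block (U+3000 to U+303F) and the full-width forms block (U+FF00 to U+FFEF). Characters outside the Basic Multilingual Plane (stored as surrogate pairs) are not counted.

```vb
' Hypothetical helper: counts CJK letters in a string by Unicode block.
' The block list below is an assumption; adjust it to taste.
Function CountCJK(ByVal s As String) As Long
    Dim i As Long
    Dim cp As Long
    Dim n As Long

    For i = 1 To Len(s)
        ' AscW returns a signed Integer, so mask it to get 0..65535
        cp = AscW(Mid$(s, i, 1)) And &HFFFF&
        Select Case cp
            Case &H3040& To &H30FF&   ' hiragana and katakana
                n = n + 1
            Case &H3400& To &H4DBF&   ' CJK Unified Ideographs Extension A
                n = n + 1
            Case &H4E00& To &H9FFF&   ' CJK Unified Ideographs
                n = n + 1
            Case &HAC00& To &HD7A3&   ' Hangul syllables
                n = n + 1
        End Select
    Next i

    CountCJK = n
End Function
```

Inside a macro this could be called as CountCJK(ActiveDocument.Content.Text), which leaves the document itself untouched.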
Best answer: gak. icky. half-arsed attempt in your inbox.
posted by pompomtom at 1:45 AM on March 31, 2008
A friend of mine has put together something like this.
posted by adamrice at 7:26 AM on March 31, 2008
Response by poster: Here's my (very, very) slightly modified version of the code pompomtom was kind enough to send me FTW:
Sub CountNonEnglishCharacters()
    Dim BadChars As Collection, candidate As Variant
    Dim CurrentFile As String, TotalCount As Long, ThisCount As Long

    ' Characters (and Word find codes) to strip out before counting
    Set BadChars = New Collection
    BadChars.Add Item:=""""
    BadChars.Add Item:="("
    BadChars.Add Item:=")"
    BadChars.Add Item:="1"
    BadChars.Add Item:="2"
    BadChars.Add Item:="3"
    BadChars.Add Item:="4"
    BadChars.Add Item:="5"
    BadChars.Add Item:="6"
    BadChars.Add Item:="7"
    BadChars.Add Item:="8"
    BadChars.Add Item:="9"
    BadChars.Add Item:="0"
    BadChars.Add Item:="%"
    ' Word find codes: ^# = any digit, ^w = white space, ^$ = any letter
    BadChars.Add Item:="^#"
    BadChars.Add Item:="^w"
    BadChars.Add Item:="^$"
    BadChars.Add Item:="&"
    BadChars.Add Item:=","
    BadChars.Add Item:="/"
    BadChars.Add Item:="-"
    BadChars.Add Item:="."
    BadChars.Add Item:="'"
    BadChars.Add Item:=":"
    BadChars.Add Item:="。"
    BadChars.Add Item:=","
    BadChars.Add Item:="("
    BadChars.Add Item:=")"
    BadChars.Add Item:=" "
    BadChars.Add Item:="【"
    BadChars.Add Item:="】"
    BadChars.Add Item:="["
    BadChars.Add Item:="]"
    BadChars.Add Item:="《"
    BadChars.Add Item:="》"

    ' Loop over every .doc file in the folder
    CurrentFile = Dir("DIRECTORY PATH*.doc")
    TotalCount = 0
    While CurrentFile <> ""
        ' MsgBox ("c:\temp\Marbridge\" & CurrentFile)
        Documents.Open FileName:="DIRECTORY PATH" & CurrentFile, ReadOnly:=True

        ' Delete every unwanted character from the opened copy
        For Each candidate In BadChars
            With Selection.Find
                .Text = candidate
                .Replacement.Text = ""
                .Forward = True
                .Wrap = wdFindContinue
                ' The ^ codes above only work with wildcards switched off
                .MatchWildcards = False
            End With
            Selection.Find.Execute Replace:=wdReplaceAll
        Next candidate

        ' Whatever is left gets counted (this includes paragraph marks)
        ThisCount = ActiveDocument.Characters.Count
        TotalCount = TotalCount + ThisCount

        ' Close without saving so the original file is not changed
        ActiveDocument.Close savechanges:=False
        CurrentFile = Dir
    Wend

    MsgBox ("There is a total of " & TotalCount & " characters.")
End Sub
posted by bokane at 11:50 PM on March 31, 2008
A quick approximation would be to count every character outside of the ASCII range, but that could overcount some things.
posted by delmoi at 9:31 PM on March 30, 2008
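For illustration, the non-ASCII approximation suggested above might look something like this in VBA. This is a rough sketch under assumptions, not code from anyone in the thread: the name CountNonAscii is made up, and the function simply counts every UTF-16 code unit above 127. That means it will also count full-width punctuation, curly quotes, accented Latin letters and the like, which is exactly the overcounting risk mentioned.

```vb
' Hypothetical helper: counts every character outside the ASCII range.
' Overcounts by design: full-width punctuation, curly quotes, accented
' letters and so on are all above 127 and are included in the total.
Function CountNonAscii(ByVal s As String) As Long
    Dim i As Long
    Dim n As Long

    For i = 1 To Len(s)
        ' Mask AscW's signed Integer result into the range 0..65535
        If (AscW(Mid$(s, i, 1)) And &HFFFF&) > 127 Then
            n = n + 1
        End If
    Next i

    CountNonAscii = n
End Function
```

For a quick estimate on the open document, CountNonAscii(ActiveDocument.Content.Text) should be enough; for anything that has to be billed by the character, filtering out punctuation first is probably safer.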