Join 3,431 readers in helping fund MetaFilter (Hide)


Mass Cyrillic to Latin translation in Excel?
September 1, 2005 6:25 PM   Subscribe

I need help finding or creating a macro to transliterate Cyrillic characters to their Latin-alphabet, English-language "equivalents" without doing a massive find-and-replace by hand.

The project is relatively straightforward:

I have an Excel file with 201+ pages of Russian names that I need transliterated, while leaving one or two other fields on the worksheet untouched. I have already converted the original Cyrillic font to unicode (thanks Firefox!!!), and know that it is possible to do the transliteration with a macro and Office 2003 for Windows.

I'm using OS X.4 and Office 2004.

Even worse, I'm working in Excel, and am not particularly enthused about exporting over eight thousand names to Word. What are my options?
I can't get the windows version of the macro to work on my Mac, can I? Is there any way I can avoid doing a massive "find-and-replace" project by hand? I've found this free macro online, but it's outdated.
posted by michaelkuznet to Computers & Internet (5 answers total)
 
$%@#! Title should say "transliteration," not "translation." I was being so careful... ;-(
posted by michaelkuznet at 6:30 PM on September 1, 2005


Here's one (quick and dirty) way to do it:

Assuming that all the cyrillic names are in column A and you want the latin names in column B:

0. MAKE A BACKUP OF YOUR FILE -- seriously, I just whipped up this code and it may be buggy.

1. open up Tools > Macros.. > Visual Basic Editor

2. Paste the following code in Sheet1 or somewhere in the workbook

3. fix the cyr and lat strings so the have all the appropriate characters in order (in my case, I used numbers instead of cyrillic characters and only went up to the 9th number)

4. make sure the sheet with the names is the active one in Excel. (if you want to change other columns, replace the 1 and 2 in the "Cells" commands with the number of the appropriate source and target columns (e.g., Cells(x,5) refers to column E))

5. make sure the cursor in the Visual Basic window is in the "Convertcells" sub and click the little "play" button on the toolbar.
Const cyr = "123456789" 'the cyillic alphabet in order
Const lat = "abcdefghi" 'the latin alphabet in the same order

Function convertchar(inchar As String) As String
Location = InStr(cyr, inchar)
if location > 0 then convertchar = Mid(lat, Location, 1) else convertchar = " "
'this will return the appropriate latin letter or a space if there is a problem
End Function

Sub convertcells()

x = 1
Do While ActiveSheet.Cells(x, 1).Value <> ""
thisone = ActiveSheet.Cells(x, 1)
out = ""
For y = 1 To Len(thisone)
out = out & convertchar(Mid(thisone, y, 1))
Next
ActiveSheet.Cells(x, 2) = out
x = x + 1
Loop
End Sub
I didn't spend any time testing this, and I'm assuming that VB and Excel handle unicode appropriately.
posted by i love cheese at 7:44 PM on September 1, 2005


There isn't a one-to-one mapping between letters in the Cyrillic and Latin alphabets, as there are Cyrillic letters that translate as 'sh', 'ch', 'ts', and so forth. I haven't looked at the Office 2003 macro code, but (assuming the source code is available) it might be possible to see what its logic looks like and use that to write a version that works on OS X and in Excel. The free macro you've linked is just a recording of a bunch of search and replace commands, so maybe you could try something similar in Excel if it has record functionality. The free macro also seems to ignore the letter i kratkaya (ะน), which could affect the quality of the transliteration.
posted by komilnefopa at 8:37 PM on September 1, 2005


There is an application that does this for OSX entitled "CyrillicEncoder"

http://www.versiontracker.com/dyn/moreinfo/macosx/13551

You can go back and forth, if I remember correctly, which actually makes text entry INTO cyrillic easier.

Hope that helps!
posted by sirion at 8:51 PM on September 1, 2005


Awesome!
ILC- I think your code is sweet; I have some other ideas where stuff *is* letter to letter, and that is excellent.
Sirion- perfect! This is what I ultimately ended up using. Heck, I think I'll hold on to the program for a while; it's just fantastically nifty. For future reference, do not try to paste 8,324 words into the program at once; it tends to slow things down.
Big Thanks!
posted by michaelkuznet at 10:56 PM on September 1, 2005


« Older Is there a safe, easy way to g...   |  The Fortune 500 company that I... Newer »
This thread is closed to new comments.