Text-To-Phoneme Conversion Software
April 12, 2006 8:04 AM

I am looking for good Text-To-Phoneme software. Any clues?

I know of a lot of Text-To-Speech solutions, but I'm looking for something that will look at (primarily English) text and convert it to the text of the phonemes. So if I have the text "My Generation", it would reply with "m AY j EH EH n ER EY P1 SH AX n" or, ideally, the correct SAMPA phonemes. Little help over here?
posted by Overzealous to Technology (4 answers total) 1 user marked this as a favorite
This won't be a ready-made solution, but it is a pronunciation dictionary of English words.

This gets you halfway there. The other half is wrapping it in a lookup tool. I don't know how comfortable you would be with that. They have a web-based query form, but I don't know how much text you need to convert. If it's substantial, you'll need to download the entire dictionary and go from there. (Or poke around the site some more; maybe I missed the exact program that would fit your needs.)
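For a rough idea of what "wrapping it in a lookup tool" could look like, here is a minimal Python sketch. It assumes the downloaded dictionary is plain text with one entry per line (the word, whitespace, then space-separated phonemes), that comment lines start with ";;;", and that alternate pronunciations are marked like WORD(1). The sample entries and the function names `load_pronunciations` and `text_to_phonemes` are made up for illustration; check them against the real file before relying on them.

```python
import re

# Illustrative entries in the layout assumed above, not the full dictionary.
SAMPLE_DICT = """\
;;; sample entries for illustration only
MY  M AY1
GENERATION  JH EH2 N ER0 EY1 SH AH0 N
"""

def load_pronunciations(text):
    """Parse dictionary text into {word: [phoneme, ...]}, keeping the first pronunciation."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        # Assumed convention: alternates look like WORD(1); strip the marker.
        word = re.sub(r"\(\d+\)$", "", word).lower()
        entries.setdefault(word, phonemes)
    return entries

def text_to_phonemes(text, entries):
    """Look up each word; words missing from the dictionary come back in angle brackets."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(entries[w]) if w in entries else f"<{w}>" for w in words]

if __name__ == "__main__":
    entries = load_pronunciations(SAMPLE_DICT)
    print(text_to_phonemes("My Generation", entries))
```

To use a downloaded copy, read the file into a string and pass it to `load_pronunciations` in place of `SAMPLE_DICT`. Words that are not in the dictionary are only flagged here, not guessed at.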

These tools (and more) are part of Carnegie Mellon's (computer) speech program.
posted by voidcontext at 8:49 AM on April 12, 2006 [1 favorite]

The problem with this is that there is virtually no demand for it other than in linguistics, and linguists would probably be able to do that as easily as you or I walk. But you might want to take a look at Linguistics Computing Resources on the Internet and see if anything there would help you.

I'm pretty sure that IBM's WebSphere Voice Server could do this. It's very customizable.

You also might want to take a look at some cellular codecs. They don't do text to phonemes; however, they do break down the sounds into "phoneme"-like parts for easier digital transmission. I wouldn't know where to get access to that, though; maybe have a look at some cellphone SDKs.
posted by bigmusic at 8:53 AM on April 12, 2006

There are probably free solutions out there, but I've worked with the AT&T Natural Voices SDK and it will give you phonemes (and visemes) if you want that instead of audio. But you need to be technical to use it (since it's an SDK), and it costs $295. Also, there may be licensing restrictions depending on your intended use.

The SDK has lots of nifty features including user dictionaries and multiple language support. But where it really shines is in converting regular text into phonemes intelligently. For example, "Mr. Jones" should be read as "mister jones"; "1964" should usually be "nineteen sixty-four", not "one thousand nine hundred sixty-four"; "30,000" should be "thirty thousand", not "thirty (pause) zero zero zero"; and "3-5 items" should probably be "three to five items", not "three dash five items" or "three minus five items".

Making TTS sound good is just one of the inherent problems. In some sense, a TTS system has to "understand" aspects of the text: what is a date, what is a name, what is an acronym, what is implicit, what should be ignored, etc. These are things that a simple word-for-word dictionary lookup unfortunately will not provide.
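To make the normalization step concrete, here is a small Python sketch that handles only the four cases mentioned above. It is not how the AT&T SDK works internally; the rules, the tiny abbreviation table, and the function names (`normalize`, `spell_number`, and so on) are assumptions chosen for illustration. In particular, it guesses that a bare four-digit number from 1100 to 1999 is a year and that "N-M" is always a range, which is exactly the kind of guess a real system needs context to get right.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
ABBREVIATIONS = {"mr.": "mister", "dr.": "doctor"}  # tiny illustrative table
NUM = r"\d{1,3}(?:,\d{3})+|\d+"  # digits, optionally with thousands separators

def number_to_words(n):
    """Spell out an integer from 0 to 999,999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")

def year_to_words(n):
    """Read a four-digit year as two pairs, e.g. 1964 as nineteen sixty-four."""
    high, low = divmod(n, 100)
    if low == 0:
        return number_to_words(high) + " hundred"
    if low < 10:
        return number_to_words(high) + " oh " + ONES[low]
    return number_to_words(high) + " " + number_to_words(low)

def spell_number(token):
    """Spell one numeric token, guessing whether it is a year."""
    digits = token.replace(",", "")
    n = int(digits)
    if "," not in token and len(digits) == 4 and 1100 <= n <= 1999:
        return year_to_words(n)  # heuristic: bare 1100..1999 is read as a year
    if n > 999_999:
        return " ".join(ONES[int(d)] for d in digits)  # fallback: digit by digit
    return number_to_words(n)

def normalize(text):
    """Expand abbreviations, numeric ranges, and numbers into lowercase words."""
    abbrev = r"\b(" + "|".join(re.escape(k) for k in ABBREVIATIONS) + r")"
    text = re.sub(abbrev, lambda m: ABBREVIATIONS[m.group(1).lower()], text,
                  flags=re.IGNORECASE)
    # Heuristic: treat "N-M" as a range and read the dash as "to".
    text = re.sub(rf"\b({NUM})-({NUM})\b",
                  lambda m: spell_number(m.group(1)) + " to " + spell_number(m.group(2)),
                  text)
    text = re.sub(NUM, lambda m: spell_number(m.group(0)), text)
    return text.lower()

if __name__ == "__main__":
    for sample in ["Mr. Jones", "1964", "30,000", "3-5 items"]:
        print(sample, "->", normalize(sample))
```

The output of a pass like this is plain words, which is the form a word-for-word pronunciation lookup needs. The hard part the sketch skips is deciding when each rule applies: "1964" could be a quantity, and "3-5" could be a score or a subtraction.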
posted by drew3d at 9:03 AM on April 12, 2006

Do you have a Mac OS X machine available? The developer tools (which come bundled for free) include an application called Repeat After Me, which (among other things) converts text to phonemes. It's at

/Developer/Applications/Utilities/Speech/Repeat After Me

I know nothing about phonemes, so I don't know if this will be useful to you. As an example, it converts 'My Generation' into

posted by chrismear at 11:13 AM on April 12, 2006 [1 favorite]
