Text to wav without limits?
May 30, 2010 8:41 PM

How can I change text into playable .wav files for minimal cost?

The backstory is that I want to take large blocks of text (book-sized, in HTML and PDF format) and turn them into several large .wav files. I think this is the kind of area where companies are still trying to squeeze money out of the service. I had a way of doing this with Terminal on my Mac (which was great), but there was a limit on the amount of text that could go into a .wav file made that way.

Ideally, I would like the books to be broken down into tracks, and put them on compact disks to listen to on my commute.

Is there a way around the limits imposed by mac's terminal on the size of wav files that it creates? That would be a great solution to this problem.

I have used a simple text to speech reader to speak my documents, while audacity recorded them. This works, but it is cumbersome and time consuming.

I would be willing to pay money for this service if the voice is high quality (such as AT&T natural voice), or if the fee is non-repeating (subscription services are not preferred).

Thank you for your help!
posted by candasartan to Computers & Internet (5 answers total) 2 users marked this as a favorite
 
Opera's text to speech (on Windows - on mac it uses the native text to speech service) is very high quality - I believe it uses an ATT natural voice. That coupled with Audacity should work nicely. On mac, you could use soundflower to route the audio into Audacity.

If you mark up the HTML with "breaks" (several repeated words or phrases), you can throw the whole text into one audio file and then split the audio file by hand at those markers.
posted by Brent Parker at 8:48 PM on May 30, 2010


If you have a Mac, this seems like the easiest answer by far to me:

http://automator.us/leopard/examples/ex07/index.html
posted by crapples at 5:46 AM on May 31, 2010


Have you looked at Festival? I don't have a Mac but I know it works on Unix/Linux. You wouldn't need to wait around for it to render in real time: you just tell it where the text file is, and it produces the .wav file. If you wanted per-chapter tracks you would split your text file as necessary. It's command line, so batch processing would be simple. Mbrola has more realistic-sounding intonation, but after a quick google it seems there might be problems with it on OS X. Both are free price-wise; only Festival is free source-wise.

If you have access to a Linux machine, Festival and Mbrola definitely work. Ubuntu has nice pre-packaged versions; not sure about the other distros. Fedora used to have rpms if I remember correctly, but there were dependency problems for me.

I would offer to do it for you - I use both quite regularly - but I'm online via my mobile so my upload speed is miserable and fairly unreliable. Soz :(
posted by blue funk at 10:53 AM on May 31, 2010
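
For anyone trying the Festival route above, here is a rough sketch of the "split your text file, then batch it" step. It assumes the book is already plain text and that chapters start on lines beginning with "Chapter" (that pattern, the function names and the track file names are all made up for illustration). It only writes one text file per chapter and builds the command lines for Festival's text2wave script; it doesn't run them.

```python
import re
from pathlib import Path

# Assumption: a chapter begins on a line such as "Chapter 1" or "CHAPTER II: Title".
CHAPTER_RE = re.compile(r"^chapter[ \t]+\w+.*$", re.IGNORECASE | re.MULTILINE)


def split_chapters(text):
    """Split text at chapter headings; anything before the first heading is its own piece."""
    starts = [m.start() for m in CHAPTER_RE.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    ends = starts[1:] + [len(text)]
    pieces = [text[s:e].strip() for s, e in zip(starts, ends)]
    return [p for p in pieces if p]


def text2wave_command(txt_path, wav_path):
    """Build the argument list for Festival's text2wave script (input file, then -o output)."""
    return ["text2wave", str(txt_path), "-o", str(wav_path)]


def plan_tracks(text, out_dir):
    """Write one .txt file per chapter and return the commands that would render each to .wav."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for i, chapter in enumerate(split_chapters(text), start=1):
        txt = out_dir / f"track{i:02d}.txt"
        txt.write_text(chapter, encoding="utf-8")
        commands.append(text2wave_command(txt, txt.with_suffix(".wav")))
    return commands
```

To actually render the tracks, each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine with Festival installed. If your book marks chapters differently, the regular expression is the part to change.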


You shouldn't need third-party text-to-speech engines when OS X already has one built-in; the problem here seems to just be bending it to your will.

Is there a way around the limits imposed by mac's terminal on the size of wav files that it creates? That would be a great solution to this problem.

Maybe I'm a bit confused, but why not just split the text files into smaller chunks and then feed those into the terminal script?

Then your WAV files would automatically serve as "chapters" on your disc.

(What's the limitation on the WAV output size from the say command, anyway? It's surprising to me that it would be less than 700 MB.)
posted by bcwinters at 2:48 PM on May 31, 2010
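
The "split the text into smaller chunks" idea above could look something like this. It's a sketch, not a tested recipe: the 20,000-character limit is an arbitrary guess (nobody in the thread knows what the real limit is), and the function names are invented. It groups paragraphs into chunks under the limit and builds a `say` command for each, using `-f` to read from a file and `-o` to write audio to a file. As far as I know `say` writes AIFF by default, so getting .wav may take a separate conversion step.

```python
def chunk_paragraphs(text, max_chars=20000):
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut mid-sentence.
    """
    chunks, current, size = [], [], 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Account for the "\n\n" separator when the chunk already has content.
        added = len(para) + (2 if current else 0)
        if current and size + added > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
            added = len(para)
        current.append(para)
        size += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def say_command(txt_path, audio_path):
    """Argument list for the macOS `say` command: read text from a file, write audio to a file."""
    return ["say", "-f", str(txt_path), "-o", str(audio_path)]
```

Each chunk would be written to its own text file and rendered with its own `say` call, so every output file becomes one track on the disc. Splitting on paragraph boundaries keeps the track breaks from landing in the middle of a sentence.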


Response by poster: Thanks everyone for all your help. In the end, it turned out that the terminal text to speech no longer works on my mac, but the automator can turn text into speech with no problems!
posted by candasartan at 9:54 PM on June 30, 2010

