Open Source: Foreign Language
September 11, 2013 3:20 AM

I am looking for an open-source / commons repository of any spoken language. Preferably free, otherwise just well-formatted.

I have a longish commute, and want to listen to language tapes on my way to work. I'm not really satisfied with typical language courses, and have been thinking of creating my own.

What I'm looking for is (Foreign Language, English) pairs of spoken language words, so that I have stripped out all the unnecessary chatter. It would be interesting if someone, somewhere, compiled such a set (maybe crowd-sourced with the help of other people; I don't care if it's different voices).

My main requirement is that the words be separable -- this is more of a vocabulary exercise than one for sentences or conversations. My secondary wish is that it be open or commons (but I would pony up the cost of a textbook for the audio tapes). Specific language is unimportant.

Do you know of such a dataset, or where I might get access to it?
posted by tintexas to Writing & Language (6 answers total)

You might want to explore the audio materials of the Rosetta project. I haven't really looked around there much myself, so I'm not sure it's exactly what you want, but it might be.
posted by lollusc at 3:28 AM on September 11, 2013

Response by poster: Ok, I lied. Primary requirement is that the language is one of the top 5 used in the 1st world (i.e., what a youngster could learn in a very good high school). The Rosetta Project stores some pretty fascinating stuff, though!
posted by tintexas at 3:35 AM on September 11, 2013

Response by poster: It wouldn't be too bad to relax this into just foreign language singlets, actually. Find me an artist reading every word in French, alphabetically. [Ok! Done threadsitting!]
posted by tintexas at 3:48 AM on September 11, 2013

Although not specifically as you describe, Duolingo is an amazing language learning tool that may rock your boat if you haven't tried and rejected it. My GCE French has quickly become tolerably rather than intolerably bad.
posted by BenPens at 7:01 AM on September 11, 2013

posted by Dansaman at 7:21 AM on September 11, 2013 [1 favorite]

If your requirement is "Primary requirement is that the language is one of the top 5 used in the 1st world", I think you are looking at Mandarin, Spanish, Russian, Portuguese, and Japanese. (I have assumed that the Middle East, Southeast Asia, and the Indian subcontinent are not your idea of "first world".)

"What I'm looking for is (Foreign Language, English) pairs of spoken language words"

If your goal is language learning, this may not be a great idea because there is not necessarily a one-to-one correspondence between words. For example, "apple" in Russian might be яблоко, яблока, яблоку, яблоком, or яблоке depending on the role the noun plays in the sentence. And in general, just memorizing vocabulary out of context doesn't do very much to advance language proficiency. But if it is simply fun for you to hear someone recite "apple - 苹果", I suppose that is something you can do.

I second the recommendation for Forvo. I have found it very helpful in my language studies and you can also make requests for readings, so you could request a list of words to be read if you wanted. If so, please pay it forward and record a few readings yourself.
posted by Tanizaki at 7:48 AM on September 11, 2013
