Shortest distance between two tongues
August 11, 2008 4:52 PM

What are the closest distinct languages? I found myself involved in a discussion about this the other day, when we were trying to figure out if the Spanish-Portuguese separation was greater than the English-Dutch separation. Then, someone else brought up the Swedish-Norwegian pair, and we all ended up rather confused.

Now, obviously, different dialects of the same language are really, really close. And, I suppose that creoles are not that far away from their constituent tongues. I'm thinking about distinct languages that (ideally) have their own literature, and such that the spoken and written language would have to be translated to be understood by speakers of the other language. I believe this eliminates Swedish and Norwegian (and I wish I had thought of that the other day, during the original discussion).
posted by math to Writing & Language (53 answers total) 10 users marked this as a favorite
 
My Spanish teacher (a native) talked (at length) about how when she traveled to Italy, she spoke Spanish to them, they spoke Italian to her, and they understood each other 98% of the time. I guess that kind of breaks your "such that the spoken and written language would have to be translated to be understood" criterion, but that's kind of my point.
posted by fogster at 4:57 PM on August 11, 2008


The Hindi/Urdu pair is another good candidate... But this discussion is doomed to go in circles, since any set of criteria for "closeness" is bound to be arbitrary.
posted by mr_roboto at 5:02 PM on August 11, 2008


Bulgarian and Macedonian are very, very close: you could easily travel and live in either country if you speak one of those languages. Bulgarian television channels will interview Macedonian celebrities without utilizing a translator or putting captions on screen, and vice versa, for example.
posted by halogen at 5:06 PM on August 11, 2008


Latvian and Lithuanian? Slovene and Serbo-Croatian?
posted by scody at 5:06 PM on August 11, 2008


I think that Spanish / Italian thing was *way* overstated; there's simply not anywhere near that degree of overlap, despite their common origins.

But in any case, their won't be an answer to this. I can think of plenty of examples where different languages are more mutually intelligible than different dialects of the same language.

I grew up speaking Serbo-Croatian - one language. Now it's three languages - Serbian, Croatian and Bosnian. There are now dictionaries for all three languages and each language has its "own" literature, when these didn't exist fifteen years ago. But the differences between these languages are much less acute than between the English(es) that the average Texan and the average Scot speak.

What a "language" is - as evidenced above, is very arbitrary. I can think of loads of similar examples, such as Moldavian / Romanian. And I can think of examples where most American English speakers would be hard-pressed to understand "their own" language, like on the streets of Kingston.

It's a fun question, but for any answer you might arrive at, there are a million exceptions which lie truly in history and the arbitrary more than in fact of language. It's one of the reasons why one only hears estimates of the number of languages on Earth . . . there's not enough agreement on the definition to allow for a distinct number.
posted by Dee Xtrovert at 5:09 PM on August 11, 2008 [2 favorites]


Oh, and there's the trio trio of Irish, Scottish, and Manx Gaelic.
posted by scody at 5:10 PM on August 11, 2008


Oh my God . . . it should be "THERE won't be an answer to this." Sorry, I'm usually a good speller.
posted by Dee Xtrovert at 5:10 PM on August 11, 2008


(that's a double-trio, for those of you scoring at home.)
posted by scody at 5:10 PM on August 11, 2008


English and American.

Seriously, the only real way to answer this question would be to somehow quantify language differences. if you can come up with a metric for vocabularies and grammars then you could calculate differentials. You could probably also get a Linguistics PhD thrown in for good measure.
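For the vocabulary half of that, one crude metric is easy to sketch in Python: take word pairs with the same meaning, compute the edit distance between their spellings, and average it. This is only a toy under loud assumptions. The function names and the two four-word lists are made up for illustration, the cognates are hand-picked (so the numbers prove nothing), and spelling is a poor proxy for how the words actually sound. It says nothing at all about grammar.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def lexical_distance(pairs):
    """Mean edit distance per word pair, normalized by the longer word's length.

    0.0 means every pair is spelled identically; 1.0 means nothing lines up.
    """
    scores = [levenshtein(x, y) / max(len(x), len(y)) for x, y in pairs]
    return sum(scores) / len(scores)


# Hypothetical mini word lists: far too few words, and far too friendly,
# to support any conclusion about the languages themselves.
spanish_portuguese = [("agua", "água"), ("noche", "noite"), ("leche", "leite"), ("mano", "mão")]
english_dutch = [("water", "water"), ("night", "nacht"), ("milk", "melk"), ("hand", "hand")]

print("Spanish/Portuguese:", round(lexical_distance(spanish_portuguese), 3))
print("English/Dutch:", round(lexical_distance(english_dutch), 3))
```

Swap in a different handful of words and the ranking can flip, which is a fair illustration of why a metric like this settles nothing by itself.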
posted by GuyZero at 5:10 PM on August 11, 2008


Paging languagehat...

I remember my linguistics professor mentioning that Scots and English have very similar grammar and moderately similar morphemes and phonemes but distinct and nearly unintelligible lexical structures. This pair satisfies your "needs-translation" requirement—I know I can't quite understand Robert Burns and Irvine Welsh unless I have a Scots glossary handy. The work of Burns, Welsh et al also satisfies your "own literature" requirement.
posted by infinitewindow at 5:12 PM on August 11, 2008


german and dutch are very close. (the german word for german is deutsch, which sometimes causes confusion.)

most germans can guesstimate their way through a dutch conversation. it kind of sounds like the other person has a heavy accent and is a bit tipsy (okay and drives a caravan). if a person speaks slowly enough, I'll get the basics. I suppose the dutch find german equally understandable yet quirky.

there is a northern-german dialekt called Platt or Plattdeutsch from which english is derived. there are close similarities between the two still present. go to rural Schleswig-Holstein and you'll hear this dialekt spoken all around, any Hanse-city will do as well.
posted by krautland at 5:14 PM on August 11, 2008 [2 favorites]


When I was a university student in Brazil the bulk of the assigned reading was in Spanish rather than Portuguese. Obviously if a work was available published in Portuguese we read it in Portuguese; if not, we all did fine reading the Spanish text. I think that defeats your "written language would have to be translated to be understood" criterion.

Spanish spoken on Brazilian TV was sometimes subtitled, sometimes not, depending on the regional accent and education level of the speaker (Bolivian coca farmer = subtitled, Argentine Finance Minister = no subtitles.) Spanish varies so widely in its spoken form, as does English.

On the other hand, a Brazilian friend of mine went to Spain to deliver a lecture, laboriously translated his talk into Spanish, and after his presentation was complimented on how easy his Portuguese was to understand, in comparison to Continental Portuguese. He was crushed.
posted by ambrosia at 5:14 PM on August 11, 2008 [1 favorite]


It depends on what criteria you use to define closeness. like mr_roboto mentioned, Hindi and Urdu are very similar (I speak some Hindi, and can easily understand my Pakistani friends when they converse amongst themselves), but their writing systems are completely different, as Urdu was heavily influenced by Arabic/Persian dialects and uses that script, while Hindi was influenced by Indic languages and uses the Devanagari script. Vocabularies of both languages have also been shaped by their respective influences.
posted by shoebox at 5:14 PM on August 11, 2008


I was a linguistics major. The unfortunate answer is that this is an unanswerable question, largely because the definition of "language" isn't solid. The old cliche that a language is a dialect with an army is largely true. For example, there are Scottish nationalists who claim that there is a Scots language-- not the well-established Scots Gaelic, but a language related to English that is separate enough to be called its own language.

Another example-- Romanian and Moldovan are essentially the same language. Mutually intelligible, percentage of shared lexicon over 99%. However, Moldova was absorbed directly into the Soviet Union and uses the Cyrillic alphabet. Some people list them as separate languages.

So the answer is going to depend on who's giving it and what they consider separate languages-- because there isn't a consensus.
posted by Mayor Curley at 5:19 PM on August 11, 2008


Indo-European languages have a fantastic family history and encompass a surprising number of languages. The family tree [SVG] gives great insight into the relationship between them.

It doesn't, however, answer your question: which siblings are closest. Nor does it address the other language families (where Chinese and Arabic fall, for instance).

A second page for languagehat...
posted by pedantic at 5:19 PM on August 11, 2008


Oh, and Yiddish and German. When I was traveling in Austria with my college roommate, who was reasonably fluent in Yiddish, she could understand most of what was being said. (She didn't actually speak Yiddish to anyone to see how closely it went in the other direction; having had ancestors die in concentration camps during the war, she just couldn't bring herself to do it.)
posted by scody at 5:22 PM on August 11, 2008


A curious note: Norwegian is to Icelandic as Swedish is to Danish. Norwegian and Swedish seem to be further apart than what we would assume.
posted by pedantic at 5:23 PM on August 11, 2008


Dee Xtrovert brings up a good point about Romanian and the language spoken by Romanians in Moldova (referred to as Moldavskij by Russians and Ukrainians, hence the anglicized Moldavian).

Most Romanian-language papers in Moldova refer to their readers as basarabenii, the bessarabians, to distinguish them from the russified Moldovans and Russians who also live there.

The differences between the language spoken by Romanians in Bucharest and Chișinău are almost exclusively orthographic: pâna (RO), pîna (MD) both mean “until” and pâine (RO), pîine (MD) mean “bread.” The only differences are the spelling reforms determined by the Academia Româna, which are followed with greater attention in Romania than in Moldova.
posted by vkxmai at 5:24 PM on August 11, 2008


I guess there's no definitive answer but I have another couple of examples: Galician and Portuguese. Frisian and Dutch.

Having said that, Dee Xtrovert makes a great case with Serbian, Croatian and Bosnian...
posted by ob at 5:25 PM on August 11, 2008


Oh, I just had a thought inspired by vkxmai's comment: are Dutch and Flemish counted as separate languages? If so they're pretty damn close, despite what my Dutch friends say...
posted by ob at 5:29 PM on August 11, 2008


Dee Xtrovert and the Mayor nail it. It's a fun question to toss around, but it's unanswerable. (For one thing, the distinction between "dialect" and "language" is entirely arbitrary.)

But "she spoke Spanish to them, they spoke Italian to her, and they understood each other 98% of the time" is typical of the kind of nonsense you hear when people toss the question around. Sadly, only linguists know enough about languages to give an informed answer, and all they'll tell you is that it's unanswerable!
posted by languagehat at 5:30 PM on August 11, 2008 [1 favorite]


Ah, upon review I see that they're not classified as separate languages, so please ignore that last one...
posted by ob at 5:31 PM on August 11, 2008


I have no idea. It's daunting with the sheer number of families, let alone the languages within families.
posted by pedantic at 5:31 PM on August 11, 2008


Another one for consideration: Polish and Ukrainian.
posted by yclipse at 5:35 PM on August 11, 2008


Hindi, Urdu, and Bengali all came from Sanskrit. If you grew up in pre-war Bangladesh you'd likely understand them all because you'd have heard them all, and they are kinda similar.

Bahasa Indonesia was originally a Malay dialect.

My boyfriend knows Danish, and he mentioned that Swedish was easier to understand by hearing but hard to read, while Norwegian was the opposite. Or I may have flipped the two.
posted by divabat at 5:38 PM on August 11, 2008


Norwegian, Swedish and Danish?

"Generally, speakers of the three largest Scandinavian languages (Danish, Norwegian and Swedish) can read each other's languages without great difficulty. This holds especially true of Danish and Norwegian. The primary obstacles to mutual comprehension are differences in pronunciation. Danish speakers generally do not understand Norwegian as well as the extremely similar written norms would lead one to expect. Some Norwegians also have problems understanding Danish, but according to a recent scientific investigation Norwegians are better at understanding both Danish and Swedish than the Danes and Swedes are at understanding Norwegian. Nonetheless, Danish is widely reported to be the most incomprehensible language of the three. In general, Danes and Norwegians will fluently understand the other language with only a little training."


Finnish and Estonian?

This thread may be of interest.
posted by iviken at 5:39 PM on August 11, 2008


American and English.
posted by blue_beetle at 5:44 PM on August 11, 2008 [1 favorite]


"The old cliche that a language is a dialect with an army is largely true."

I believe that to fulfil Weinreich's criteria properly, you need a navy as well. Which is why Swiss German is still just a dialect...
posted by i_am_joe's_spleen at 6:00 PM on August 11, 2008


Wikipedia has a mutually intelligible languages page which points to articles that shed light on this question.
posted by i_am_joe's_spleen at 6:03 PM on August 11, 2008 [1 favorite]


Others beat me to the 'dialect with an army' line, though in some cases, that's not proved true: standard German is not Prussian, nor is standard Italian Piemontese.

I think the best answers to this precise question -- the languages closest to mutual intelligibility without being so -- focus on orthography. Hindi and Urdu, Romanian and Macedonian, etc. -- you might be able to understand what another speaker is telling you, but you wouldn't necessarily be able to read it.
posted by holgate at 6:43 PM on August 11, 2008


Dutch and Flemish
posted by pompomtom at 7:06 PM on August 11, 2008


The canonical (well, according to David Crystal) closest-to-english-that-is-not-english is Frisian. Tok Pisin might qualify too, as it's gone way beyond a typical pidgin.

Hamburg and Glasgow dockers were fabled to be mutually intelligible, to the likely exclusion of all others.
posted by scruss at 8:05 PM on August 11, 2008


Just to add another pair into the mix: Afrikaans and Dutch. They were the same language 300+ years ago but developed separately. Dutch people often comment that Afrikaans sounds like a very old form of Dutch.
posted by Gomez_in_the_South at 8:29 PM on August 11, 2008


you might be able to understand what another speaker is telling you, but you wouldn't necessarily be able to read it.

Conversely, as a Latvian, I can more or less follow written Lithuanian, but can barely make head or tail of it when I hear it spoken.

But your question has bigger issues. First, how to define distinct languages, as opposed to dialects? And second, by what criteria can you say that one pair of languages is more or less close than another pair?
posted by UbuRoivas at 8:36 PM on August 11, 2008


The canonical (well, according to David Crystal) closest-to-english-that-is-not-english is Frisian.

I spent some time in Ostfriesland years ago. People there had heard the notion that "Frisian is really close to English", too. Occasionally someone would offer you something printed in Friesisch to see if you could understand it.

To most English speakers, it might as well have been Linear A.

(There are some related words, of course, but they really don't stand out any more or less than they do in Dutch, or even German.)
posted by gimonca at 8:51 PM on August 11, 2008


there is a northern-german dialekt called Platt or Plattdeutsch from which english is derived

Not really. Platt (like Dutch) is still more a first cousin to English, not a direct ancestor.

(In English, "Platt" is generally called "Low German".)
posted by gimonca at 9:03 PM on August 11, 2008


the only real way to answer this question would be to somehow quantify language differences. if you can come up with a metric for vocabularies and grammars then you could calculate differentials.

Non-snarky question: If linguists don't do this, why not?
posted by lukemeister at 9:26 PM on August 11, 2008


Non-snarky question: If linguists don't do this, why not?

Thousands of variables, many of them predicated on perception and biases. A couple of generations ago, some folks would have done this happily, only to have their work sneered at by modern linguists. Linguistics with a firm scientific approach is rather young, but it's aware of its current limitations. Which is exciting because many of the Big Discoveries are still ahead of us.
posted by Mayor Curley at 9:36 PM on August 11, 2008


I spent most of my graduate work on designing elicitation corpora that could be used with any human language. I spent much of my time looking at lots of languages and trying to figure out how they are different grammatically from one another.

Now, there are a lot of techniques that one could use to determine whether two samples contain the same language. Given to a computer, most of these techniques are completely abysmal, though they fare much better if a human does the analysis. There is just something about the human ability for pattern recognition that computers can't yet match.

The big problem is that even among people who speak the same language and same dialect, there will be great variation based on age, education, gender and class. This complicates matters greatly, as it can be difficult to find field data in the first place for many languages, let alone data from informants with matching backgrounds. Talking to multiple informants is one way to compensate, but this often depends on money and availability.

I will tell you that in my professional opinion the line between a dialect and a distinct language is rather arbitrary. Some of the examples above make that point quite clearly and using languages that most of us have heard of. However, it is not hard to imagine that there are even better examples among the smaller, poorly documented languages of the world. According to Ethnologue, there are about 6,900 living languages on this planet.

Of those, about a quarter can be found in Papua New Guinea and Indonesia. Most of these languages can be found in areas with rough country, places where there aren't enough field linguists to do in-depth studies of each one. With the sheer number of languages we would require 50 Kenneth Hales to cover the gap, and even then it's possible that at least two Kenneth Hales are studying essentially the same language but their data is different enough for them to overlook that fact. Even when comparing relatively rigid Swadesh word lists, it's possible that informant accents/word choice or phoneme interpretation is responsible for the difference.

So really, there could be two languages out there classified as different, when they are essentially the same due to the fact that there are lots of people studying these languages and it's hard for everyone to be consistent with one another. So, I guess the shortest possible distance between two tongues could be zero.

For further reading there are two books that have a lot to say on this subject. The first is the excellent Dialectology by Chambers and Trudgill. It goes in depth in explaining the difference between a dialect, a creole and a language (the answer is that the line is arbitrary). The second is Language Death by David Crystal. While the main subject of this book is quite fascinating, the first chapter lays out the case for belief that there are about 6000 human languages spoken in the world today, and not say, 500 or 20,000. This book is a few years old and probably did not have the benefit of the latest Ethnologue data, but its reasoning is quite solid.
posted by Alison at 9:37 PM on August 11, 2008 [4 favorites]


In an ideal world, if I wanted to measure the mutual intelligibility between two languages I would do the following:

1. Assemble a corpus of 20-50 news articles in language A and one of the same size with the same type of source material in language B.
2. I would give language A's corpus to 50-100 speakers of language B and have them translate it into their native language/dialect. I would ideally make sure that the language B speakers had no formal or informal training in language A. That might be impossible.
3. I would do the same for language B's corpus and language A speakers.
4. I would compare all of the translations for similarity using leave-one-out cross-validation, possibly using a machine translation scoring metric like BLEU or METEOR.
5. Now I'll have a distribution of scores for my translators on either side. Hopefully, I'll have done this with lots of data that has already been classified into language vs. dialect so that the scores will mean something.

This scenario presumes that language A and B have writing systems and native language newspapers, which is not true for a big chunk of the world's languages.
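The scoring in step 4 can be roughed out in a few lines of Python. This is a minimal sketch, not real BLEU or METEOR: it uses clipped bigram precision as a stand-in, holds each translation out in turn, and scores it against all the others. The function names and the three sample "translations" are invented for illustration; a real study would use an established metric implementation and real data.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_score(candidate, references, n=2):
    """Clipped n-gram precision of one translation against the others.

    A crude stand-in for BLEU: no brevity penalty, a single n-gram order.
    """
    cand = ngrams(candidate.lower().split(), n)
    if not cand:
        return 0.0
    # For each n-gram, the highest count seen in any single reference.
    best = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split(), n).items():
            best[gram] = max(best[gram], count)
    matched = sum(min(count, best[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def leave_one_out_scores(translations, n=2):
    """Hold out each translation in turn and score it against the rest."""
    return [
        overlap_score(t, translations[:i] + translations[i + 1:], n)
        for i, t in enumerate(translations)
    ]


# Invented example: three speakers of language B translating the same
# language A sentence. The third one clearly did not follow it.
translations = [
    "the minister opened the new bridge",
    "the minister opened a new bridge",
    "a man walked over the water",
]
print(leave_one_out_scores(translations))
```

Run over every article and every translator, the list of scores is the distribution described in step 5. High, tightly clustered scores would suggest the translators mostly understood the source; low or scattered scores would suggest guessing.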
posted by Alison at 9:50 PM on August 11, 2008


Oh, and Yiddish and German.

Except that Yiddish has imported many, many words and expressions from Hebrew and Aramaic, as well as other European languages. Also, the Southeastern variety of Yiddish has a vowel shift that renders it very different from German, phonologically speaking.
posted by greatgefilte at 10:06 PM on August 11, 2008


The linguist April McMahon (at the University of Edinburgh) seems to be one researcher at the center of current attempts to quantify and classify language in a formal way. Most of her work, however, has been in the realm of English dialect.

Just a data point. I'm not sure you can ever answer this definitively, but some methodology may in the future achieve common acceptance. In her own book she says that confidence largely rests on the method and the creator of the method, and in individual linguists' command of multiple languages. This subjectivity is what her work seeks to overcome.
posted by dhartung at 10:21 PM on August 11, 2008 [1 favorite]


Many people above have said that the difference between a dialect and a language is arbitrary. I think this claim could be easily misunderstood. This line/difference between dialect and language is NOT objectively arbitrary...meaning that it's not as if there were a bunch of points at which we could collectively decide "this makes it a dialect" or "this classifies it as a distinct language", and people just randomly picked and agreed upon a point, and then considered the matter settled based on arbitrarily chosen criteria.* The distinction is arbitrary in the sense that each linguistic region is influenced by many things, including but not limited to social, economic, and political power, geography, linguistic evolution, presence or lack of orthography and writing system, and phonological, syntactic, and morphological variation. These are some of the variables involved in defining language vs. dialect, and they influence each linguistic area in different ways and amounts. It's not a conscious process either. It just happens as groups change, evolve, use language, and define themselves or are defined by others. In this sense, the criteria used to make distinctions seem arbitrary if you compare groups**; we can't predict which group is going to be influenced by what factor and how they will decide to draw lines.

The OP's question is unanswerable on so many levels. But I'd start with the faulty premise of universal criteria for determining the distinction between language and dialect. Also, a writing system (orthography), if a language even has one, is not a good metric for comparison of languages, especially for determining relatedness or mutual intelligibility. That'd be a weird sort of reverse engineering (which, yes, is what historical linguists do...but would not be a fitting application for getting to the answer of this type of question).

The biggest thing to remember here is that there are often visible reasons why linguistic varieties (languages, vernaculars, dialects, etc.) create, keep, or change the boundaries that they do. Sometimes these varieties want to be a dialect of a common language for political reasons, solidarity, economic or social inclusion; sometimes it's the complete opposite. There is so much going on here, it can't easily be quantified or compared.

I like to think of languages as children...some have the same parents, some different ones, some are raised together, some apart, some in isolation, some are neglected, some are privileged, some have prestige and social power, some are educated, some are in poverty, some are dying. All grow and change over time; it can't be helped! Now, how does each analyze their own experience, and those of others, and agree on the definition of what constitutes the "family" and "closest living relatives"?

*I'm not saying this is what other posters are suggesting; I just want to make it clear that this would NOT be the factually correct interpretation of phrasings such as, "the distinction between a language and a dialect is entirely arbitrary."
**Example: "Well, they call this thing a dialect and if they did that over here, we'd call it a language, because crap! we just can't understand them!"

posted by iamkimiam at 11:03 PM on August 11, 2008 [1 favorite]


As an analogy: this is like asking what the most similar distinct colors are. Vermillion and crimson are pretty similar, but do they count as different colors, or are they just different shades of the same color?
posted by painquale at 4:03 AM on August 12, 2008 [1 favorite]


The various "dialects" of Chinese?

From my experience living with a bunch of Nordic classmates: Norwegians understand Swedish and Danish with less difficulty, with (spoken) Danish requiring more effort (once a Swede commented that trying to understand Danes is like trying to understand a drunk person with a potato down their throat).

Norwegian TV has quite a lot of un-subtitled Swedish and Danish programs, and Norway being a smaller country means that they are more affected by their neighbours than vice versa. Written Danish and Norwegian (bokmål) are close, and only Swedish has those weird letters with umlauts (ä, ö). Swedish-Danish couples I know have a greater problem understanding each other.

Icelandic is written like really old Norse, and no one else understands it when spoken! Finnish shouldn't be counted at all of course, but Finns I know have a strange affinity for Hungarian (they pick it up quickly) and the other way round as well.

Honestly there's no estimable answer to this question.
posted by monocot at 7:10 AM on August 12, 2008


FYI, Frisian is close to Old English not Modern English. In the documentary Mongrel Nation, Eddie Izzard learns how to say "I want to buy a brown cow" in Old English and heads over to Friesland.
posted by nooneyouknow at 7:52 AM on August 12, 2008


As someone else has noted above, Polish and Ukrainian. My Polish ex-gf was able to watch and understand Ukrainian programming, but couldn't read its Cyrillic alphabet.
posted by Chuckles McLaughy du Haha, the depressed clown at 8:05 AM on August 12, 2008


FYI, Frisian is close to Old English not Modern English.

That seems to over-promise as well. Old English is highly inflected; my impression is that Frisian is not.

Frisian is related to English (Old or otherwise) in ways that aren't going to be terribly obvious at a casual glance. Garden mint and teak trees are closely related, too, but it took modern genetics to tell people that.

Anyway, you can create single sentences in everyday Dutch (or German, or even Swedish) that English speakers would probably understand, if you're willing to cherry-pick the vocabulary aggressively. That's different from sitting someone in front of a TV in the other language and asking if they can understand what's being said in general.
posted by gimonca at 8:56 AM on August 12, 2008


The various "dialects" of Chinese?

They're no more similar than the Romance languages. Please at least try to know what you're talking about before answering, although in the case of this question, WHICH IS UNANSWERABLE, it really doesn't make much difference.
posted by languagehat at 10:03 AM on August 12, 2008


"But "she spoke Spanish to them, they spoke Italian to her, and they understood each other 98% of the time" is typical of the kind of nonsense you hear when people toss the question around."

When I was in Italy, I spoke English to people and they spoke Italian to me, and we understood each other 90% of the time. Ergo, Spanish is only 8% different.
posted by klangklangston at 12:01 PM on August 12, 2008


Finnish shouldn't be counted at all of course, but Finns I know have a strange affinity for Hungarian (they pick it up quickly) and the other way round as well.

Finno-Ugric
(Estonian's there too).

posted by ersatz at 12:26 PM on August 12, 2008


scody She didn't actually speak Yiddish to anyone to see how closely it went in the other direction
I have it all the time that I pick up fragments of a conversation on the street, think that it was german and then it turns out to have been yiddish. they are very close. also: we use a lot of words exactly the same way. think Mensch.

Norwegian, Swedish and Danish?
my danish roomie claims he can understand swedes and norwegians but that it's not all that easy and that he knows a lot of fellow danes who have a very difficult time doing this. he seemed to think it depended on just where in denmark they were from. according to him, swedes find danish a tough nut to crack.

oh yeah, and finnish is just like klingonian to him.

gimonca: you're right, thanks.
posted by krautland at 4:16 PM on August 12, 2008


Speaking only from my own limited experience here.

Knowing Polish, I can understand Czech well enough, and knowing Swedish allows me to communicate with Norwegians and Danes, and only to a certain extent Icelanders. (although one year in Iceland helps - they're better at understanding Swedish than vice versa)

This all depends a lot on the accent of the speakers. Southern and northern Swedes can have more difficulty communicating than south Swede & Dane, or western Swede & Norwegian.

Reading Norwegian is simpler than Danish, and drunk Danes (or those from north-west) usually compress their pronunciation into small chunks of sound that lodge in your brain and die.
posted by monocultured at 4:40 PM on August 12, 2008

