Alphabet Efficiency
July 10, 2008 10:53 AM

Are any writing systems or alphabets more efficient, in the sense that they are more readable, than others? What is the most efficient?

I currently work in an office where the primary day-to-day working language is Arabic, which of course has a different alphabet than my native language.

On a page of written English, I can quickly scan and find any particular word I am looking for. I can't do this in Arabic, at least not very quickly. Obviously, this is because I don't know Arabic nearly as well as I know English and because I haven't been exposed to reading Arabic for more than a few years.

Still, watching my coworkers pore over Arabic pages with the same fluidity I read similar English pages has caused me to wonder whether, say, I am faster in English than my coworker is in Arabic. More specifically, I wonder whether there are certain alphabets or writing systems that lend themselves to faster or more efficient reading. That is, can readers of language X read faster, find words faster, etc. than readers of language Y, environmental, educational, and other factors being equal?

Relatedly, are phonetic languages more efficient than character based languages? And, has the evolution of writing systems generally moved in the direction of more efficient and quicker reading?

posted by ecab to Writing & Language (15 answers total) 7 users marked this as a favorite

Last week I saw, in Armenia (with its own curvy alphabet), a kid had arranged trash to spell out I love you in both Russian and English and I thought "Yeah, Cyrillic and Latin alphabets sure make it easier to spell out stuff. Spelling out stuff in Armenian would be a pain in the ass."

I then thought about writing SOS with coconut shells in Armenian, which would be a pain as well. Also there are apartment buildings here that are organized in the shape of CCCP (which spells SSSR). That'd be a pain to do with Armenian for sure.
posted by k8t at 11:00 AM on July 10, 2008

There's no way to evaluate that. Too many factors are involved, and too many of those are subjective.
posted by Class Goat at 11:03 AM on July 10, 2008

I disagree with Class Goat. There are people out there that study this. I looked around in Google Scholar for "script reading speed" and "script reading language" and found some articles. I imagine that someone out there can point you to some better resources though.
posted by k8t at 11:16 AM on July 10, 2008

There is lots of study on this and related topics. This article I quickly found pointed out that native Chinese and English speakers use different parts of the brain in order to process math. I also recall (but cannot find) a more recent article that compares the words for numbers in different languages, and how many languages are more efficient than English (and some less so) for naming numbers higher than 10.
posted by Cool Papa Bell at 11:22 AM on July 10, 2008

Just a thought, but I bet languages where each letter corresponds to one phoneme, not including diphthongs etc., are going to end up at the top of your list. Spanish is the main example I am thinking of, the rationale being that you can hear a word and immediately know exactly what it looks like, thanks to the one-letter, one-phoneme scheme.
posted by The Esteemed Doctor Bunsen Honeydew at 11:35 AM on July 10, 2008

It shouldn't be impossible to evaluate, but I imagine it's way more complicated than the way you're framing it. For instance, with reading speed, how do you determine if the two readers are reading the same "thing" in order to compare?

Are they reading something that is semantically equivalent? The same number of phonemes? The same number of discrete written characters? The same number of words? You could just pick one of these, but there's no guarantee that the results would hold if you picked another.
posted by juv3nal at 12:07 PM on July 10, 2008

I've always thought Japanese has a very efficient writing system. It's because they combine three writing systems: Chinese characters (and some unique Japanese characters that have evolved), hiragana, and katakana.

They use Chinese characters as they are for nouns, proper nouns, and pronouns, then they're also used for the root part of adjectives, adverbs, and verbs. They use hiragana to conjugate the end of the adjectives, adverbs, and verbs. (So you see a character for the word itself, then a string of hiragana to conjugate it.) Hiragana is also used for particles and other grammatical markers, and sometimes to spell out words that, for whatever reason, are not popularly written using their Chinese character. Pronouns are sometimes written in hiragana as well. Katakana is used for foreign loan words, onomatopoeia, and anything you might want to italicize.

Anyway, those three writing systems look distinct on the page. It's quite easy to scan for something, because there are so many ways to do it.

If you're looking for a particular word, you're almost always scanning for a Chinese character. Unlike Chinese, where everything is written in these characters, the Chinese characters in Japanese are usually nestled between hiragana, so they really stand out as being a single word. Furthermore, unlike English where words only look different by merit of their length and a vague shape to them, Chinese characters look pretty distinct. One word usually does not look that similar to other words in a document. Even if there are some similar characters here and there, your eye stops to check far less often than it would in an English document, and it can gloss over far more.

Plus, some words are more than one character. You know how many characters the word you're searching for is, so you just look for clusters of that many characters. If the word you're looking for is two characters, it doesn't matter if there's a one-character word whose character looks similar to one in your word. You go right past it. And the longer the word is, the easier it will be to find, because one- and two-character words are more common. If you're looking for a three- or four-character word, you'll find it really fast. So a lot of the search is just counting that you don't even have to think about. Again, the number of times you have to stop to compare the word on the page to the word in your mind is far lower than in English.

But let's say you're not interested in a particular word: you're looking for stuff that happened in the past, or did not happen, or is likely going to happen, etc. For example, say you're looking at an article about a company, and you don't care about the history the article talks about, you just want to know what plans they have. You just scan the hiragana for the proper conjugation to find what you're looking for and read from there; it doesn't really matter what adjective, verb, or adverb it's attached to. In my experience, this is much more difficult to do in English.

Then katakana is typically used much less frequently than Chinese characters or hiragana, so if you're scanning for something that's in katakana you barely have to look at anything before you find it. Also, since it can be used to italicize, and such things are usually important, it sticks out just like italics in English.

Sometimes they use the Roman alphabet for loan words as well, so it's very easy to scan for that.

On top of that, hiragana and katakana symbols represent entire syllables. In Japanese, if a word has "ka" or "chi" (or whatever) in it, you're looking for a single symbol that looks nothing like any other symbols. In English, you would be looking for two or three letters combined for that one sound, all of which are used in other combinations as well. Relatively speaking, this isn't that efficient.

One thing English has over Japanese is that it uses spaces between words. However, the way Japanese is set up, you don't need the spaces anyway.
posted by Nattie at 12:24 PM on July 10, 2008 [2 favorites]

Salim Abu-Rabia and Linda S. Siegel, "Reading skills in three orthographies: The case of trilingual Arabic–Hebrew–English-speaking Arab children," Reading and Writing 16 (2003): 611-634 looks like it would be of great interest; unfortunately it's not free online, but you should be able to find a copy of the journal at a library. (Or maybe some nice MeFite has one and can report on the findings.)

Just a thought, but I bet languages where each letter corresponds to one phoneme, not including diphthongs etc., are going to end up at the top of your list.

It is really quite useless to speculate about this stuff. Results are often counterintuitive.
posted by languagehat at 12:51 PM on July 10, 2008 [2 favorites]

Thanks languagehat. I am associated with a university so I was able to get a copy of the article online and read it. It addresses language learning a bit more than the exact crux of my question, but is quite interesting nonetheless and, in any case, does shed some light.

Interestingly, it suggests:

Beaumont (1982) and Bradshaw and Nettleton (1983) suggest that words that appear in the right visual field are identified more easily than words presented in the opposite direction. This is most probably true, as the language center of the brain is located in the left hemisphere so it receives more direct input from the right visual field. Young and Ellis (1985: 14) posit that “the magnitude of right visual field superiority increases with increasing word length.”

The article also mentions how English can be difficult because it's irregular in its pronunciation. But with regard to my question, I'm not sure this matters, since when I read I don't so much read the words as recognize them, reading the word as a whole. It seems like this might be more difficult in a language like Arabic or Hebrew, which are typically unvowelled when printed, because the actual word's pronunciation and definition has to be decided based on context.
posted by ecab at 1:34 PM on July 10, 2008

I am not an expert, but I read and write both Hebrew and English fluently, was raised on both languages, and move pretty well through both of them, and I can tell you that I can scan a page in Hebrew and find words very easily. Better than that, because Hebrew is structured on word roots (shoresh), you don't need to find the exact conjugation of a root to find what you are looking for. When scanning, the prefixes and suffixes fall away and you can scan for three-letter roots. It is pretty easy to find things on a page in Hebrew, even a page with a complicated layout and different script types (like a page of Talmud), and even without vowels, punctuation, or distinctions between characters (e.g. capitals or italics).

I once did an exercise for a Hebrew class I taught where I removed all the vowels and punctuation from an English essay, capitalized everything, and had the students decipher it and find key words and phrases. It was very difficult. I did the same thing with Hebrew and they were much quicker to pick up on patterns and words.
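For anyone who wants to try the English half of that exercise, here is a rough sketch in Python. It assumes "vowels" means just the letters A, E, I, O, U and that punctuation means ASCII punctuation; the function name is made up for illustration. It is cruder than real unvowelled Hebrew, which still writes some vowel sounds with ordinary letters.

```python
import string

def strip_vowels_and_punctuation(text):
    """Uppercase the text, then drop A/E/I/O/U and ASCII punctuation."""
    kept = []
    for ch in text.upper():
        if ch in "AEIOU" or ch in string.punctuation:
            continue  # skip vowels and punctuation marks
        kept.append(ch)
    # Collapse runs of whitespace left behind by words that were all vowels.
    return " ".join("".join(kept).split())

print(strip_vowels_and_punctuation("Hello, world!"))
```

Words made up only of vowels, such as "a" or "I", disappear entirely, which is part of why the English version is so hard to decipher.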
posted by allfortheBoss at 3:43 PM on July 10, 2008

With respect to whether you use sound-meaning consistency in highly irregular writing systems: there's a continuum of spelling regularity, with some systems being very 'deep' or 'irregular' (e.g., Chinese), others quite 'shallow' or 'regular' (Italian, Finnish), and English somewhere in between. But no matter how deep the spelling of your language is, there's nice evidence to suggest that everyone, even very skilled readers, uses spelling-sound decoding strategies when reading familiar words. Even in Chinese. Here's a nice paper about it:

Zhang, S., Perfetti, C. A., & Yang, H. (1999). Whole-word, frequency-general phonology in semantic processing of Chinese characters. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(4), 858-875.

Perfetti and others have looked at this stuff in other ways, and again, even in languages where there are only tiny cues about spelling-sound consistency, readers consistently use the information. What it means is that even though you've seen CAT a million times, you still activate the sound "cat" in your head, not just the meaning.

To answer your question about which is more efficient: this is a huge debate, but my view (as someone who studies these things sometimes) is that there's always going to be a tradeoff between spelling-sound consistency and the complexity of the spelling system. Chinese is very efficient in that you can squeeze a lot of words into a small space, but learning that system is quite difficult and there's maybe more cognitive overhead involved. Finnish is probably a lot easier to learn to read because each letter has a sound associated with it, 1:1; however, it takes up more space and arguably requires more effort to read because you're moving your eyes across a larger space to take in the same amount of information. All writing systems are just a compromise between these two extremes.
posted by drmarcj at 4:05 PM on July 10, 2008

It seems like this might be more difficult in a language like Arabic or Hebrew, which are typically unvowelled when printed, because the actual word's pronunciation and definition has to be decided based on context.

Actually, I'm pretty sure it's just the reverse: because we read words as gestalts and there are fewer characters per word in Arabic and Hebrew, they are read more efficiently (as allfortheBoss says of Hebrew). I actually came here intending to post a link substantiating that, but I haven't been able to find one. I know I've read it, though.
posted by languagehat at 5:11 PM on July 10, 2008

the actual word's pronunciation and definition has to be decided based on context.

E.g., words like convict, read, impact, which all have at least two pronunciations (and meanings) depending on context.

Or words like bank, hit, pitch, which have multiple meanings depending on context.

But you're definitely right about word shape assisting in reading. A study I read showed that, with enough context, you can really blur the hell out of a word and people can still "read" it because of its descenders, risers, length, etc. Too lazy to look it up right now, but I believe it was in a cognitive science journal. Not positive of that, though.
posted by tractorfeed at 5:45 PM on July 10, 2008

I was pretty impressed with Hangul (the Korean writing system) when I discovered its history and how it works. Each symbol is actually a bundle of consonant and vowel components that describe how to pronounce the symbol. The system was created in the 1400s by linguists as an alternative to Chinese characters.

I don't actually read or understand Korean, but I became pretty familiar with the Hangul input system under Windows when I had to debug a Hangul input bug at work a few years ago...
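To make the "bundle of components" idea concrete, here is a small sketch in Python. It relies on the arithmetic layout of the precomposed Hangul syllables in Unicode (U+AC00 through U+D7A3): 19 leading consonants, 21 vowels, and 28 trailing slots, the first of which means "no final consonant". The function names are my own, not part of any library.

```python
import unicodedata

S_BASE = 0xAC00                 # first precomposed syllable, "가"
L_BASE, V_BASE, T_BASE = 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
S_COUNT = 19 * V_COUNT * T_COUNT  # 11172 syllable blocks in total

def decompose(syllable):
    """Split one precomposed Hangul syllable into its conjoining jamo."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    jamo = chr(L_BASE + lead) + chr(V_BASE + vowel)
    if tail:                    # tail index 0 means no final consonant
        jamo += chr(T_BASE + tail)
    return jamo

def compose(lead, vowel, tail=0):
    """Build a syllable block from zero-based jamo indices."""
    return chr(S_BASE + (lead * V_COUNT + vowel) * T_COUNT + tail)

for part in decompose("한"):
    print(f"U+{ord(part):04X}", unicodedata.name(part))
```

The standard library's `unicodedata.normalize("NFD", ...)` performs the same split, so it makes a handy cross-check for the arithmetic.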
posted by strangecargo at 12:14 AM on July 11, 2008

As Languagehat says, results on readability are often counterintuitive; being raised with a "mother" orthography will have an enormous bearing on one's ability to manipulate it quickly and efficiently. However, an important related question is the "objective" efficiency of a given orthography when confronted with technical limitations. Early machine-readable fonts needed to present characters to OCR systems with a minimum of confusion, say between "p" and "f," or between lower-case ell "l" and upper-case "I." The result is the kind of "computer writing" - actually Magnetic Ink Character Recognition - that we're used to seeing at the bottom of checks.

The challenge the Arabic orthography presents is the degree of visual similarity between individual characters: a taa, ت, is distinguished from a thaa, ث, by only one "dot." The corresponding potential for confusion thus increases along with the stylistic choices made by a typographer. You can see this in examples of traditional Arabic calligraphy, which at its most complex makes the text nearly unreadable.

What's so interesting about Arabic, though, is that on some level I think the language understands and works around this confusion when it becomes necessary to do so. If you ever drive through the Egyptian countryside, you'll see the work of bored military cadets on the sides of hills. They often painstakingly arrange rocks to spell out the shehaada, the statement of the Islamic faith that says "There is no god other than God [Allah] and Muhammad is his prophet." The Arabic for this - لا اله الا الله ومحمد رسول الله - requires no dots, and is therefore well-suited as a universally-readable message. Make of that what you will.
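The "no dots" claim is easy to check mechanically. Below is a rough sketch in Python; the set of dotted letters is my own list of the basic Arabic letters that carry dots in their standard printed form, so treat it as an assumption rather than an authoritative inventory. The phrase is copied as written above.

```python
# Basic Arabic letters that carry one or more dots (illustrative list):
# beh, teh marbuta, teh, theh, jeem, khah, thal, zain, sheen, dad, zah,
# ghain, feh, qaf, noon, yeh.
DOTTED = {
    "\u0628", "\u0629", "\u062a", "\u062b", "\u062c", "\u062e",
    "\u0630", "\u0632", "\u0634", "\u0636", "\u0638", "\u063a",
    "\u0641", "\u0642", "\u0646", "\u064a",
}

def dotted_letters(text):
    """Return the dotted letters found in the text, in order of appearance."""
    return [ch for ch in text if ch in DOTTED]

PHRASE = "لا اله الا الله ومحمد رسول الله"

print(dotted_letters(PHRASE))   # letters with dots found in the phrase
print(dotted_letters("ت ث"))    # taa and thaa differ only in their dots
```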
posted by awenner at 1:42 AM on July 11, 2008 [1 favorite]
