Why are they "words" in English, but "root with suffixes" in Inuit?

I've often heard that the Inuit language is unlike English (or most other languages that I am vaguely familiar with) in that it doesn't really use distinct words to form sentences, but instead uses some root plus a bunch of suffixes to form a single word which would be expressed as a sentence in English.

For example, Wikipedia gives the word "tusaatsiarunnanngittualuujunga", meaning "I can't hear very well". This is formed of:
  • tusaa-: to hear
  • -tsiaq-: well
  • -junnaq-: be able to
  • -nngit-: not
  • -tualuu-: very much
  • -junga: I
This makes no sense to me. I don't understand what it is that makes these "not words". Or, equivalently, what it is that makes English sentences "not a word".

Why isn't it the case that there's something in Inuit called "tusaa tsiaq junnaq nngit tualuu junga" that we refer to as a "sentence"?

Why isn't it the case that there's something in English called "Icanthearverywell" that we refer to as a "word"?

Other than the obvious fact that somebody decided not to put a space between morphemes in written Inuit, I don't understand what the fundamental difference is.
Well, one thing to think about is that the sounds of the morphemes (e.g. tusaa, tsiaq, junnaq -- the parts that carry meaning) change when they're combined. English words do that a little bit, but Inuktitut (not Inuit) words do that a LOT.

-tsiaq- means 'well', but you can see in the word that it changes to -tsiar- when it's combined with -junnaq-, and -junnaq- changes to -runnan- when it's attached to -tsiaq- and -nngit-. Most English words (e.g. baseball) don't change no matter what else is in the sentence.
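To make that concrete, here is a small Python sketch (the variable names are mine, purely for illustration). It glues together the morphemes exactly as they are listed in the question and compares the result with the word as it is actually written. It says nothing about the rules that produce the changes; it only shows that plain gluing does not give you the real word.

```python
# Morphemes in the form they are listed in the question.
morphemes = ["tusaa", "tsiaq", "junnaq", "nngit", "tualuu", "junga"]

# The word as quoted in the question.
attested = "tusaatsiarunnanngittualuujunga"

# Naive approach: treat the morphemes like English words minus the spaces.
naive = "".join(morphemes)

print(naive)
print(attested)
print(naive == attested)  # False: the pieces change shape when combined
```

Do the same with "I", "can't", "hear", "very", "well" and you get back exactly the letters of the English sentence, which is the contrast being drawn here.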

Think of them kind of like our word endings, like our -s ending meaning 'plural'. We have 3 versions of this:

cat, cats -- "s" sound
leaf, leaves -- "z" sound
bridge, bridges -- "uz" sound

Is -s a word in English? No. It's a part of a word that adds meaning. It's a morpheme, and it changes depending on what it's attached to. Most of Inuktitut is made up of morphemes, not words.
(kind of a crap answer, but in any event, the structure of Inuktitut relies on things like our -s, or pre-, or un-, far more than English does, and on separate 'words' far less)
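The three versions of -s above can be sketched as a tiny rule in Python. This is only a rough illustration: the sound classes are simplified, the spellings of the sounds ("sh", "j", "uz" and so on) are informal stand-ins rather than proper phonetic symbols, and the function name is made up.

```python
# Rough classes for the LAST SOUND of the noun (simplified, not exhaustive).
HISSING = {"s", "z", "sh", "zh", "ch", "j"}   # bus, buzz, dish, garage, church, bridge
VOICELESS = {"p", "t", "k", "f"}              # cup, cat, book, cliff

def plural_sound(final_sound: str) -> str:
    """Return the version of the plural ending used after the given final sound."""
    if final_sound in HISSING:
        return "uz"
    if final_sound in VOICELESS:
        return "s"
    return "z"  # everything else: voiced consonants and vowels

print(plural_sound("t"))  # cat -> cats
print(plural_sound("v"))  # leaves (the stem ends in a "v" sound in the plural)
print(plural_sound("j"))  # bridge -> bridges
```

The ending never gets to choose its own shape; it is decided entirely by what it attaches to, which is the sense in which it is a piece of a word rather than a word.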
The whole idea of "word" makes sense for English and many other languages that are "analytic," less so for languages that are "synthetic," like Inuit.

What's the difference? Well, there's really a continuum between analytic and synthetic extremes of language classification. But let's take things back a step. Here's what Wikipedia says:

In morphological typology (in linguistics), an isolating language (also analytic language) is any language in which words are composed of a single morpheme. This is in contrast to a synthetic language which can have words composed of multiple morphemes.

What's a morpheme? Again, from Wikipedia:

In morpheme-based morphology, a morpheme is the smallest linguistic unit that has semantic meaning.

The concept morpheme differs from the concept word, as many morphemes cannot stand as words on their own. A morpheme is free if it can stand alone, or bound if it is used exclusively alongside a free morpheme.

In essence, analytic languages are composed mostly of "free morphemes." English is pretty analytical, as things go. Look at the sentence:

My friends are in the school.

Pretty much everything is "free," except the "-s" that makes friends plural. That's a little bit of syntheticness that exists in English. There are other examples. "Him" carries within it the sense of an "object." It's really the same thing as "he," just with a narrowed usage.

In Hungarian, you'd have:

A barátoim az iskolában vannak.

Five "words" instead of six in English. But "word" for "word," this really is:

1) The 2) friends-my 3) the 4) school-in 5) they-are.

Hungarian is much more synthetic than English. In this case, morphemes that are "free" in English are "bound" in Hungarian. This includes possessive pronouns and many prepositions, including many of position, such as "in." If a quirk of Hungarian didn't necessitate the inclusion of definite articles, the sentence above would take half as many "words" in Hungarian as in English.

Many languages make plurals (a bit of synthesis in English) by doubling the singular morpheme. Instead of "friends," they say something equivalent to "friend friend." Possibly, native speakers of such languages wonder why we do it in this crazy way - the same way you wonder about the Inuit, except the Inuit take it much further.

It seems natural to you that what makes a word plural in English (commonly, the suffix -s) isn't its own word. But in other languages it would be.

Another synthetic aspect of English is words like "unbrotherly." This means the same thing as "not in the manner of a brother," but we've just created the same sense synthetically:

un + brother + ly

"Un-" and "-ly" don't mean anything on their own, right? Aside from conveying a "sense" (of negation and manner, respectively), that is. So they're not words, they're bound morphemes.

It's really a question of degree. English speakers don't do it that much. Mandarin speakers do it even less, from what I understand . . . it's a more analytic language than English. Hungarians, Finns, Inuit and other speakers of certain languages do it much more.

The best way to look at it is in terms of morphemes. English has a lot of "free" ones, Inuit has a lot of "bound" ones. My native language, Bosnian, is a little more synthetic than English, but not by much.
But in English, we wouldn't hesitate to call "hear" and "hears" words, and obviously closely related ones at that, despite the fact that one is used and the other is not depending upon some of the other things used around them.

So I'm still not getting why we would hesitate to call "tsiaq" and "tsiar" words, and obviously closely related ones at that, because of the fact that one is used and the other is not depending upon some of the other things used around them.
While I agree that the sentence/agglutinative-word distinction seems a bit circular ("Why is it one word in Inuit?" "Because there are no spaces." "Why are there no spaces?" "Because it's one word."), there are criteria for determining whether something is one word or separate words.

For example, although you're right that a fluent English speaker would pronounce "I can't hear very well" as one fluid stream of sound without pausing between words, if you asked the speaker to slow down, he/she would (likely) utter the sentence with silences matching up with the word divisions, i.e. "I... can't... hear... very... well." But he/she probably wouldn't insert a silence between the two syllables in "very" -- maybe the vowels would be drawn out to emphasize them, but there wouldn't be any gap in the sound. (This is also true for words that are composed of multiple morphemes, e.g. "strawberry.") My understanding is that this is also the case in agglutinative languages -- someone speaking slowly will not insert silences between the separate morphemes.

Also, if you make a speech error in the middle of a word, you typically go back and repeat the entire word (you wouldn't just repeat the syllable you mispronounced), e.g. "I'd like an apripo-- apricot," rather than "I'd like an apripo-- cot." Once again, my understanding is that in agglutinative languages this is still the case (and I've heard this about a dialect of Inuit specifically) -- someone who commits a speech error while saying "tusaatsiarunnanngittualuujunga" has to go back and repeat the whole thing over again.

There are other tests, but I can't think of them right now.

So there are motivations for distinguishing between words and sentences. However, this doesn't answer the question of what the importance is of distinguishing between words and sentences, which is an entirely different issue. What does it mean that "I can't hear very well" is five (or six, if you expand the contraction) words in English but one word in (some dialect of) Inuit? I'd say "very little." It seems like the kind of thing people like to focus on in order to say "Look! Those crazy Eskimos have such a weird, inefficient language!"
So I'm still not getting why we would hesitate to call "tsiaq" and "tsiar" words, and obviously closely related ones at that, because of the fact that one is used and the other is not depending upon some of the other things used around them.

For the same reason that "-s" and "un-" and "-ly" aren't words in English. They have meaning, but they don't stand on their own. "Words" are essentially free morphemes.

But in English, we wouldn't hesitate to call "hear" and "hears" words, and obviously closely related ones at that, despite the fact that one is used and the other is not depending upon some of the other things used around them.

Well . . . again, "word" can be a meaningless distinction. You should read about the concept of lemmas, which explains the concept of roots to some extent. I speak a lot of languages and am always studying; it's easier for me to think of things in their lemma form, since the "add-ons" work so differently in every language. To put it another way, "go" and "goes" are pretty much the same word to me, just inflected differently. In the simple present, English verbs have only two forms (except archaic forms and "to be"). But in many languages - French, German, Hungarian - there are often six forms. It's just easier to think of them all as versions of the same lemma.
Re: isolating and synthetic. These terms are often used as opposites, but it might be easier to think of it as two independent axes. The isolating-agglutinative axis has to do with how many morphemes there are per word -- purely isolating languages have single-morpheme words, whereas agglutinative languages have many morphemes per word. The synthetic-analytic axis has to do with how many meanings you get per morpheme. In analytic languages, each morpheme means one thing (e.g. first person in one morpheme, singular in another). In synthetic languages, each morpheme can mean multiple things (e.g. first person and singular in one morpheme).

As for words vs. morphemes, you're right to wonder what exactly constitutes a word, which can be problematic. I don't know much about Inuit languages, but in some languages you can tell what's a word by looking at phonological features like stress -- each phonological word might have a single stress, regardless of how many morphemes are agglutinated to form it. There may also be syntactic evidence. Suppose, for example, there were free adverbs that could go anywhere in an Inuit sentence (like Quickly I left, I quickly left, I left quickly). If such an adverb couldn't occur at certain morpheme boundaries, that might be evidence that the boundary is inside a word. (Of course, it might also imply constraints on what words adverbs can occur between, as in *I met quickly him.)

Again, I am not an Inuit expert, but Inuit experts are, and it's unlikely they're just blindly assuming what's a word and what's not. It's a known issue.
But in English, we wouldn't hesitate to call "hear" and "hears" words

I would hesitate to call "hear" and "hears" separate words in English. Yes, if you're playing Scrabble, you can add the 's' onto "hear" for another point. But, they mean the same thing. By a (rapidly fading) quirk in English, we just have to add the -s ending if the subject of the sentence is singular and not the speaker. In fact, I might point you to the English spoken in my old neighborhood in Philly, where you might hear such an utterance as "I know he hear that baby crying." The meaning is not modified in the slightest, despite dropping the 's'.

As for what constitutes a word... have a look at German, which is not as synthetic as, say, Hawaiian, but is more synthetic than English. They're free to create truly ginormous compound words, using all(?) parts of speech, that then operate syntactically as a single word.
Without getting knee deep into linguisticky stuff, you're essentially trying to equate English syntax with Inuit morphology. Every language has both morphology and syntax systems (as well as other necessary components of language), but languages vary in how they use, and how much they use, these systems to allow speakers to express the ideas conveyed through the language.

Also, think about the words in an English sentence. You can rearrange those words to enhance meaning, change meaning, change focus, etc. Present day English has a pretty complex syntactic system. Word order matters. Conversely, English has lost much of what used to be a rich morphology system; look into the loss of case markings for examples of this. Now Inuit looks like it has a very complex morphology system, but not so much with the syntax. Your question essentially asks, "What if we took the elements of Inuit morphology and chopped them up and converted them to the syntax system that this other language uses (English)?" Put a completely different way, it's a lot like what happens when you try to print RGB (a light medium) screen images through a CMYK color printer (ink system/medium)...the spectrum isn't designed to be broken up that way, and you end up with some color values that get lost in translation.
But in English, we wouldn't hesitate to call "hear" and "hears" words

I would hesitate to call "hear" and "hears" separate words in English.

Don't confuse types and tokens. The sentence I hear what he hears contains five word tokens, but perhaps only four word types if you consider hear and hears to be the same type (which you are free to do). But if somebody said, "I have four words for you, my friend: I hear what he hears," he'd either be kidding or mistaken.
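If it helps, the token/type count for that sentence can be checked with a few lines of Python. The lemma table here is a made-up, one-entry stand-in for "consider hear and hears to be the same type"; nothing about it is standard.

```python
sentence = "I hear what he hears"

# Tokens: every running word in the sentence, repeats included.
tokens = sentence.split()

# Hypothetical lemma table: treat "hears" as an inflected form of "hear".
LEMMAS = {"hears": "hear"}

# Types if every distinct spelling counts separately.
surface_types = set(tokens)

# Types if inflected forms are folded into their lemma.
lemma_types = {LEMMAS.get(token, token) for token in tokens}

print(len(tokens))         # 5
print(len(surface_types))  # 5
print(len(lemma_types))    # 4
```

Either way of counting types is defensible; the number of tokens stays at five regardless.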
I'm wondering if you're less concerned about the linguistic definitions of the words "word" (paraphrasing Saussure - Linguistics would be great if we didn't have to use language to talk about it) and "roots with suffixes," and more wondering if it's a disrespectful way to put things.

Maybe I'm reading too much into this, but you sound defensive. If that's the case, don't be. "Roots with suffixes" is not a way of saying "primitive words." It's simply descriptive.
I was watching an episode of Psych and the two protagonists were trying to get the police to investigate the death of a popular sea lion. But the detective wouldn't normally care about an animal, so instead of telling him the name was "Shabby the Sea Lion" they told him "Shabby Thesealion" (pronounced with a slightly French accent).

The point being, when you combine words in English you create a new word with a new meaning and new pronunciation rules.

I know some Latin, so let's use that.

E: I gave to the girl a rose.
L: Puellae rosam dedi (1. girl-to 2. rose 3. gave-I)

Now. Could we write the English sentence with Latin rules? Sure:

girlto rose gavei.

But in English that hardly makes sense, because the language doesn't work that way. What we have are two new words, and depending on how you try to pronounce the last one, gavei sounds nothing like gave-I.

What about the Latin as English?

I ded ae puell rosam.

Of those, only the last is a valid word in Latin; the rest is nonsense. Unlike English, even new words have to follow certain rules. While you could make the case for 3 new nouns in the nominative, I don't see anything that looks even remotely like a verb there. And despite its appearance, I does not mean the English I (for that there's a different word: ego).

My point is to illustrate what iamkimiam points out:

Without getting knee deep into linguisticky stuff, you're essentially trying to equate English syntax with Inuit morphology. Every language has both morphology and syntax systems (as well as other necessary components of language), but languages vary in how they use, and how much they use, these systems to allow speakers to express the ideas conveyed through the language.

An aside (having nothing to do with Inuit), interestingly enough, when writing English you need the spaces to help you figure out where one word begins and the other ends. For example:


Hard to parse, but after a couple false starts most people get it without much trouble. But maybe that is really a new word. A new drug to cure heart disease, perhaps. Without the spaces there's no way to tell.

Now take the Latin as it might have originally been written:

PVELLAEROSAMDEDI

Kind of ugly, but easier to read. That's because I don't have to backtrack as much. By the time I get to PVELLA-, I know I'm almost certainly dealing with the word for girl, and that only a certain number of letter combinations can follow before the word is finished! It's the rules. Same thing with ROSA- and DED-. I don't need the spaces to find the words.
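Here is a toy Python sketch of that kind of reading, using the Latin sentence from earlier (Puellae rosam dedi) written in capitals with no spaces. The "lexicon" is an assumption for illustration and contains only the three word forms from the example; a real reader knows thousands of stems and the endings each can take. The function name and the greedy longest-match strategy are also mine, and longest-match can go wrong on other inputs.

```python
# Toy lexicon: just the three word forms from the example (an assumption).
LEXICON = {"PVELLAE", "ROSAM", "DEDI"}

def segment(text: str, lexicon: set[str]) -> list[str]:
    """Split unspaced text into known words, trying the longest match first."""
    words = []
    i = 0
    while i < len(text):
        # Try candidates starting at position i, longest first.
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon:
                words.append(text[i:j])
                i = j
                break
        else:
            # No known word starts here, so the text cannot be segmented.
            raise ValueError(f"no known word starts at position {i}")
    return words

print(segment("PVELLAEROSAMDEDI", LEXICON))  # ['PVELLAE', 'ROSAM', 'DEDI']
```

The predictable endings are what make this work without spaces: once a stem is recognized, only a few continuations are possible before the word has to end.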
On further reflection:

It's my understanding that most suffixes/infixes in Inuktitut do not have consistent meaning when uttered alone. In order to have meaning, they must be appended to roots according to the appropriate syntactic rules. A native speaker might be able to work out what you mean if you set up the context correctly, but your utterance would be broken (perhaps humorously).

Here's the equivalent in English:

"How informed are you about that topic?"

"Un."

You and I can sort of work out the fact that the second speaker means that he is "uninformed"--that is, they essentially said "uninformed" but abbreviated it. This is highly ambiguous, though, and would be totally lost on a non-fluent speaker even if she knew the word "uninformed". You have to be so comfortable with the structure of the language that you can reinterpret the nonsense into sense.

But, if you answered the question "How was your day?" with "un", even the fluent listener would have no idea how to interpret your utterance. Not only have you not answered their question, you haven't even produced something that can be an answer. No amount of context is going to turn your answer into something comprehensible. Only rephrasing your answer as "uneventful" (or the like) will clear it up--essentially wiping out your previous statement.

Contrast this to the following exchange:

"How do you like your eggs?"

"Scrambled."

In this instance, your utterance has clear meaning that needs context only to apply a subject. Furthermore, if somebody asks, "How was your day?" and you answer "scrambled", the listener might ask in what way your day was scrambled, but at least he's certain that the concept of "scrambled" does apply to your day. You can then say, "Oh, they just kept shifting the meetings around." "Scrambled" still applies; you've just informed the listener as to how it was scrambled.

I also think that Wikipedia may have done you a disservice in how they went about doing that deconstruction. Work entirely within English and apply the same rules as Wikipedia seems to have in their deconstruction of that word, and you can see how it rapidly leads to real problems by defining direct equivalences where none exist.

Let's take "antidisestablishmentarianism".

anti-: against
dis-: undo
establish: create or initiate
-ment: object of an action
-arian: member of group
-ism: a belief

But, "I'm swimming anti the tide" doesn't sound at all natural to us. While somebody might figure out you mean "against the tide", it sounds weird.

Nor does "I just deleted that file by accident, I've got to dis that," work even a little bit.

"When I'm done carving, I'll go jogging. But, first I have to finish this ment," took me forever to come up with, and it still sucks. 'Cause "-ment" isn't even *close* to a word for us. It's just a doodad that lets us refer to the general object of an action, when all we're concerned about is the action aspect. It essentially lets us noun a verb and discuss the object of the verb in the abstract. FedEx ships shipments because it doesn't care what's inside the box.

"ism" is interesting, in that we actually do recognize it as a word, but as a neologism. The Rastas, for instance, talk about avoiding "isms and schisms" (which is why it isn't "Rastafarianism"). I don't know how old "ism" as a word is, but I really don't think very long. You have to be pretty post-modern to talk about all belief-movements.

Even with the whole word itself, there are ambiguities as bad as Wikipedia's transliteration. Other than the general sentiment that it's "a belief that the people who want to undo something that was established are wrong", there's nothing at all telling us that, in this context, "establish" means "official and exclusive state support of a church". The word just simply does not apply if you're talking about disliking the idea of closing down a hardware store.

Furthermore, it's vital that you not take the Wikipedia example, which is almost certainly an easy one, as representative of the entire language. I'm certain that there are utterances much more difficult to break down into nice, easy groups for anglophones to read. I mean, for my example above, I picked a very long, obscure word made mostly of -fixes; it would have been a terrible rhetorical decision to deconstruct "bluebird" instead. Just as it would have been a bad choice for the Wikipedian to have chosen a word whose fixes couldn't be broken down into small, sensical English phrases.
Just a slightly different language design, with different demarcations between what is a word and what is a sentence. English is fairly verbose in that you just pick standalone words that you assemble into a sentence to make the point. Other languages combine "fragments" into "mega words" that you put in order to make the point. Or, just a different set of prefixes, roots and suffixes. English does this a lot with newer invented words like thermometer, telephone, automobile. Sort of.

Spanish does this with reflexive verbs, adding a -me to verbs when they are applied to oneself.
I wrote a bunch of that last night while on cold medication, so it may or may not make sense/be relevant to the discussion :)
