Computer languages considered in linguistic contexts?
July 27, 2008 10:43 PM

I've volunteered to give a talk about computer languages, for an audience which is primarily interested in linguistics, not computers.

One thing I'm definitely going to be talking about is Perl, which is unusual among computer languages in that, for instance, it has "pronouns". For more detail, see Larry Wall's "Natural Language Principles In Perl". Wall deliberately went against the grain of language construction, it seems to me, in that he didn't aim for perfection or concision, whereas most language creators would seem to be aiming for something more like a Philosophical Language.

Another example of a correlation -- I read in Pinker's "The Language Instinct" that English only has one first-person plural, "we", but other languages have more than one, which distinguish between "only you and I", "you and I and some other people", "some other people and I but not you" etc. Could that be considered analogous to "strong typing" in a computer language, where you can't have a variable be a number at one point, then a text string at another point?

I'm just looking for any interesting resources, links, opinions, ideas on the subject.

It's not an academic audience, and I don't have to rise to the standard of a formal lecture, just to be interesting. And I know more about the computer side than the linguistics side, so go easy on me if you're a linguist.
posted by AmbroseChapel to Writing & Language (19 answers total) 16 users marked this as a favorite
 
From one linguist's perspective here... I'd be interested in hearing a speech that didn't necessarily make connections or claims about advanced features of a programming language, but gave a more general overview of some of the features of one (or more) programming languages. I'd want to know things about programming languages that I could apply and compare to spoken languages. What is the syntax like? Is it always SVO? What are some computer programming language universals? Are there any morphological variations in c.p. langs? What are some context-sensitive commands or features? How do these languages change and/or evolve? How are changes dealt with and/or adopted by their "speakers"? How long does it take to acquire these languages? What does the 'family tree' for programming languages look like? Can the comparative method be applied to tracing origins of terms, cognates, and the like? Vocabulary size? Mean length of utterance (MLU)?

Dang, there's a million directions you could go with this. And I'm sure many papers and studies have already been written for you! I personally don't have any knowledge of this specialty in linguistics, but I don't doubt some other mefites can point you towards specific articles that could help you prepare a kickass, interesting speech! Good luck!
posted by iamkimiam at 11:00 PM on July 27, 2008 [2 favorites]


What kind of linguistics is your audience interested in?
posted by egg drop at 11:06 PM on July 27, 2008


You might look at how computers parse programming language vs how humans parse languages. Garden path sentences are a good jumping off point.
posted by Deathalicious at 11:14 PM on July 27, 2008


There seem to be two issues here: one is the syntax of languages and the other is the structure of languages. You seem to be focusing on the former while the latter might be similarly interesting (although how applicable to linguists I'm not sure). The (amazing) Curry-Howard correspondence, for example, suggests to me that computer programs themselves are philosophical languages that we've simply discovered.

Linguistics is also related to AI, so you could talk about Prolog or other logic-based languages. However, if your audience consists of computational linguists, they'd know about this already.

Dijkstra: On the foolishness of "natural language programming".

-----
(More related to your perl example)
Early versions of AppleScript apparently included something similar.
posted by null terminated at 11:15 PM on July 27, 2008 [1 favorite]


(Also, you could spend hours just talking about Chomsky's contributions)
posted by null terminated at 11:17 PM on July 27, 2008


One idea that occurs to me is that programming languages have idioms that are particular to the language. I ran into this recently, learning Python from a C++/Java background. Initially, while learning Python, I was trying to say things that I knew how to say in Java, but the result was always awkward and not very pretty. Eventually, I started to pick up on the language's idioms and began to "think in" Python. At that point, I realized I was fluent in it. I think this parallels natural languages to a high degree.
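
To make the "accent" idea concrete, here is a minimal sketch (the word list is made up purely for illustration). Both halves compute the same thing; the first is how someone arriving from Java or C++ might phrase it, the second is closer to what a fluent Python speaker would probably write.

```python
# Hypothetical sample data, chosen only for illustration.
words = ["perl", "python", "forth"]

# "Java with a Python accent": index-driven loop and manual accumulation.
lengths_accented = []
i = 0
while i < len(words):
    lengths_accented.append(len(words[i]))
    i += 1

# Idiomatic Python: a list comprehension says the same thing directly.
lengths_idiomatic = [len(word) for word in words]

print(lengths_accented == lengths_idiomatic)  # True
```

Neither version is wrong, which is rather the point: the first is grammatical but reads like a word-for-word translation.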
posted by knave at 11:26 PM on July 27, 2008


You definitely need to look at Inform, which is used for writing interactive fiction. In particular, look at the example game for Inform 7. There's a white paper here [pdf] that explains some of the principles in detail.
posted by xchmp at 11:27 PM on July 27, 2008


I don't think there is really that much similarity. If you wanted to talk about computers with linguistics people, talk about Context Free Grammars.
posted by delmoi at 11:28 PM on July 27, 2008


Could that be considered analogous to "strong typing" in a computer language, where you can't have a variable be a number at one point, then a text string at another point?

I wouldn't say so, but that's just personal opinion. I think as you examine the use of language (which I only really scratched the surface of when doing a bit of ESL) you realize that what seems like a casual language can have some very strict rules. I'd say strong typing is more analogous to the rules that languages have about what kinds of words can show up where in a sentence. Oh! Even better... you know how sometimes people will use the same pronoun in one sentence to refer to two completely different things?

As in, "They looked at their hands, and they were clean." In a strongly typed language, you could argue, this would be "illegal" because you are putting hands into a "variable" that up to that point contained people. This sentence is actually easy to understand, but sometimes sentences can be difficult to understand because it isn't clear what the "they" stands for. Although now that I think about it the comparison doesn't totally work.

Computer languages, even more flexible language systems like perl, are generally planned. However, in the same way that human languages evolve to accommodate the communicative needs of new generations, programming languages evolve new language parts to deal with new technological forces. So looking at how languages add syntax and vocabulary over time might be interesting.

I guess perl has some natural language in it, but I've always felt (as someone just starting to learn it) that perl has waaay too much punctuation in it. There are a lot of languages that sound and look a heck of a lot more like English, not that this is necessarily the point.

Oh, here's one thing...a lot of programming languages use SVO (subject verb object) in how they order tokens in a phrase...think [You] print 'hello world' or man.bite(dog). However, there are few programming languages that are VSO (+ 3 4, for example) just as there are few human languages that behave that way.
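
A rough sketch of the word-order point, using Python only because it happens to allow all three orders for the same addition. The SVO/VSO labels in the comments are a loose analogy, not a claim about how linguists would classify these forms.

```python
import operator

infix = 3 + 4                # operand, operator, operand: roughly "SVO"
prefix = operator.add(3, 4)  # operator first, like Lisp's (+ 3 4): roughly "VSO"
method = (3).__add__(4)      # subject.verb(object), like man.bite(dog)

print(infix, prefix, method)  # 7 7 7
```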

I'd be very interested in how language affects programming... in the sense that many programmers are working in a language whose word tokens do not come from their own language. "if x then y" makes a lot of sense to someone whose native language is English, but for someone who isn't a native speaker, there will be a small extra layer of complexity.
posted by Deathalicious at 11:31 PM on July 27, 2008 [1 favorite]


I would look at theories of Mathematics as a language, since linguistically math and computer languages are quite similar. Math is part of natural language. 1 + 1 = 2 can be read aloud as a cromulent English sentence, just like you can read aloud a line of code. On the other hand math and computer languages have unique grammar and in their symbolic form can be read by people who speak different languages. It might be interesting to see if there are computer programs that can't be read in a natural language.
posted by afu at 11:57 PM on July 27, 2008


Hypertalk had pronouns, prepositions, ordinals, the definite article, all kinds of stuff.

eg
get the first word of card field 1
if it is "wombat" then get any word of card field 2
put it into myVariable
posted by w0mbat at 12:05 AM on July 28, 2008


I study software engineering and linguistics, so this is something I'm fascinated by. I asked a question about non-English programming languages once, it might have some ideas for you. There are some pointers there to wikipedia resources on Chinese, French and 'nonsense' based programming languages.
posted by jacalata at 12:08 AM on July 28, 2008


I don't particularly agree with your analogy to strong typing, but following up on Deathalicious' pronoun example, it might work better the opposite way- using the pronoun 'they' in two different contexts to illustrate dynamic typing, as the 'variable' isn't changing value, but is being interpreted in two different contexts. Since Perl itself is dynamically typed, maybe you could make a connection there?
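
For the talk, it may help to separate two ideas that often get blurred together: whether a name can be rebound to a different kind of value (dynamic vs. static typing) and whether values get silently coerced (weak vs. strong typing). A small sketch in Python, which is usually described as dynamically but strongly typed; the variable name and values are just illustrative.

```python
# Dynamic typing: the same name can refer to different kinds of things over time.
they = ["Alice", "Bob"]  # first "they": a list of people
they = 2                 # same name, now a number (say, a count of hands)

# Strong typing: values are not silently coerced across types.
def add_number_and_text():
    try:
        return 1 + "1"       # mixing a number and a string
    except TypeError:
        return "TypeError"   # Python refuses rather than guessing

print(add_number_and_text())  # TypeError
```

The situation described in the original question (a variable that must stay a number) is closer to what is usually called static typing.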

There's always the Chomsky hierarchy. Maybe you could discuss the fact that programming languages have to be structured so that they can be unambiguously parsed/compiled, and the general parsing techniques for them. If you're talking to linguists I'd imagine they know the basics of those topics but have never really put them into practice, since they don't really work for English, so that might be interesting.
posted by version control at 12:13 AM on July 28, 2008


I don't know if you're familiar with the historical and philosophical development of computer programming languages (usually taught in the context of compiler design), and the way that ALGOL (iirc) was pretty directly inspired by the contemporary linguistic work on formal grammars as systems of rewriting rules. Regular expressions, for example, are Chomsky type 3 languages (except they aren't any more); ALGOL-derived languages usually aspire to be type 2 languages (context-free), though of course the grammars can't express a lot of pretty basic linguistic restrictions like variable type (or agreement of number, tense, etc).
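
A small sketch of the type 3 vs. type 2 distinction, in Python. The digit pattern is a regular language; balanced parentheses are the textbook example of a language that is context-free but not regular, so the recognizer needs some memory (here, a depth counter). The names DIGITS and is_balanced are made up for this example.

```python
import re

# A type-3 (regular) pattern: one or more digits. No memory of nesting is needed.
DIGITS = re.compile(r"\d+")

def is_balanced(text: str) -> bool:
    """Recognize balanced parentheses, a classic context-free (type-2) language."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closer with no matching opener
                return False
    return depth == 0      # every opener must have been closed

print(bool(DIGITS.fullmatch("2008")))  # True
print(is_balanced("(()())"))           # True
print(is_balanced("(()"))              # False
```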

Computer languages can be clustered into families by descent and similarity much like natural languages; the vast majority of modern languages descend from ALGOL (often via C). There are some other language families, like the LISP languages, and some relics still hanging on from the pre-formal-grammar days like FORTRAN and BASIC (though modern BASIC is pretty ALGOLy as I understand it). If you're looking for variation, you might also poke into corners like Prolog and FORTH (or PostScript or other RPN languages). The way that some languages completely eschew iteration for recursion would probably blow a few non-CS minds, if you can get the concept across in the talk.
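
If you want to show the iteration-versus-recursion idea on a slide, something like the following might do. It is a sketch in Python rather than in one of the languages that actually avoid loops, and the function names are invented for the example.

```python
def total_iterative(numbers):
    """Sum a list by looping and updating a running total."""
    total = 0
    for n in numbers:
        total += n
    return total

def total_recursive(numbers):
    """Sum a list with no loop: the sum is the first item plus the sum of the rest."""
    if not numbers:  # base case: an empty list sums to zero
        return 0
    return numbers[0] + total_recursive(numbers[1:])

print(total_iterative([1, 2, 3, 4]))  # 10
print(total_recursive([1, 2, 3, 4]))  # 10
```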

Perl is interesting because it's essentially a creole. The traditional UNIX environment contains a number of domain-specific minilanguages, and encourages creating more. Larry Wall very pragmatically incorporated parts of many of these languages into Perl, despite the fact that the component languages sometimes had wildly different structures.

RPN languages are interesting in that they can express the same structures as the ALGOLish languages, but they do so in a way that's much harder for humans to understand, because we're not accustomed to maintaining deep stacks of referents. But in a sense they have no syntax, or at least very little syntax, which is a departure from syntax-heavy languages like C (or, God forbid, C++).
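
To illustrate the "stack of referents" point, here is a minimal RPN evaluator sketched in Python. It is a toy supporting only four operators, not an implementation of FORTH or PostScript; eval_rpn and OPS are names made up for the example.

```python
import operator

# The only "vocabulary" this toy language has.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_rpn(expression: str) -> float:
    """Evaluate a space-separated RPN expression using an explicit stack."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()  # the most recent operand is the right-hand one
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))  # anything else is treated as a number
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(eval_rpn("3 4 + 2 *"))  # 14.0, i.e. (3 + 4) * 2
```

Note that no parentheses or precedence rules are needed: the order of the tokens carries all of the structure.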

There are languages like AppleScript and COBOL and a bunch of database query languages from the '80s/'90s that attempt to mimic human language, but IMHO they all fail. Exploring the reasons for this would be interesting, I think, but out of scope for your talk. :)

I think Deathalicious' example is not an example of dynamic typing, but an example of what programmers would call scoping or binding: each 'they' refers to a different referent, according to the usual rules of English. Programming languages make use of limited scope as well, though scopes are typically nested in programming languages, instead of using rules like "nearest preceding applicable phrase" as in English. But on the other hand, there's (let*) or (letrec), not to mention (fluid-let) or Perl's local().
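
A sketch of the nested-scope idea, in Python rather than Scheme or Perl, with names invented for the example. The same name is bound twice, and which referent you get depends on where you are standing.

```python
they = "the people"      # outer binding

def look_at_hands():
    they = "the hands"   # inner binding shadows the outer one inside this function
    return they

def look_at_people():
    return they          # no local binding here, so the enclosing scope is used

print(look_at_hands())   # the hands
print(look_at_people())  # the people
```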
posted by hattifattener at 1:41 AM on July 28, 2008 [2 favorites]


I have read suggestions that some version of the Sapir-Whorf hypothesis holds for computer languages, which sentiment is expressed in the old saw "you can write Fortran in any language."

The impetus for so-called multi-paradigm languages is to allow programmers to find a natural expression for problems. And you can find computer language connoisseurs who will say "this is a problem best solved in Haskell, whereas this is a problem best solved in C", in the same way that 19th century scholars might claim that Italian was the best language for poetry while German was ideal for philosophy.

Something else that might be of interest to linguists is the notion of namespaces, which can be elided in some programming languages where things are clear from context, paralleling the way that in natural languages we can often omit things that are clear from context.
posted by i_am_joe's_spleen at 2:25 AM on July 28, 2008 [3 favorites]


IANAL. You might want to talk about ambiguity and context in computer vs. 'natural' languages vs. constructed languages like lojban (lojban.org) (namespaces, multiple inheritance, unspecified compiler behavior, etc...). Also, the use of redundancy in computer vs. human communication (unrolling loops for optimization vs. emphasis and clarity in spoken word).
posted by BrotherCaine at 3:52 AM on July 28, 2008 [1 favorite]


I do research on programming languages, on type systems in particular. The most beautiful connections between programming languages and natural languages I ever encountered involve delimited continuations.

See Chris Barker's paper, Continuations in Natural Language. He shows how continuations, a beautiful construct of programming languages in their own right, can be used to elegantly capture certain phenomena in natural languages that are challenging to capture otherwise.

http://www.cs.bham.ac.uk/~hxt/cw04/barker.pdf

Chung-chieh Shan also has a paper on the subject,

http://arxiv.org/abs/cs.CL/0404006

posted by gmarceau at 4:15 AM on July 28, 2008 [2 favorites]


this sometimes crops up at lambda - if you don't read there you might want to search around.

the case of pronouns agreeing in number is closer to dependent types than strong typing in general. also, note how in a natural language typing and syntax tend to be interlinked, while in computer languages they are usually orthogonal (perl is an exception to some extent).
posted by not sure this is a good idea at 7:13 AM on July 28, 2008


Type theory plays a huge role in the modern approach to semantics (i.e. generative semantics, as opposed to lexical semantics). This approach is based on the work of Richard Montague, and what has come to be known as Montague grammar, which relies heavily on the notion of type. It's a little different than the way you describe it in your question -- the type corresponds to a syntactic category, and words can be thought of as functions that return other functions, or (finally) truth values.

So the word 'smiles' (the verb, not the plural noun) can be thought of as a function that takes an argument of type "individual" and returns a truth value. The sentence "John smiles" can then be analyzed as smiles(John), which is the function smiles applied to the argument John, returning true just in case John does indeed smile, false otherwise. On this analysis, a transitive verb is a function that takes an individual and returns another function of the same type as 'smiles'. 'Likes' is a verb of that type. 'Likes' takes an argument of type 'individual', say, Mary, and returns a function 'likes Mary' which takes an individual, say John, and returns a truth value (true if John likes Mary, false otherwise).
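
The analysis above translates fairly directly into code. This is a toy sketch in Python, not Montague's actual formalism: the little "model" of who smiles and who likes whom is entirely made up, and individuals are represented as plain strings.

```python
# Hypothetical toy model of the world: who smiles, and who likes whom.
SMILERS = {"John"}
LIKINGS = {("John", "Mary")}  # (liker, liked) pairs

def smiles(individual: str) -> bool:
    """Intransitive verb: individual -> truth value."""
    return individual in SMILERS

def likes(liked: str):
    """Transitive verb: takes the object first and returns a function
    of the same type as smiles (individual -> truth value)."""
    def likes_object(liker: str) -> bool:
        return (liker, liked) in LIKINGS
    return likes_object

likes_mary = likes("Mary")        # the function 'likes Mary'
print(smiles("John"))             # True
print(likes_mary("John"))         # True: John likes Mary
print(likes("John")("Mary"))      # False: in this model Mary does not like John
```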

The Barker abstract linked to above should give you a good overview; also this paper (by Mark Steedman, link is to a pdf) is a great "manifesto" on combinatory categorial grammar which should give you an idea of how computational linguists are looking at the issue.
posted by tractorfeed at 11:55 AM on July 28, 2008

