Help me understand natural language processing
April 3, 2007 4:39 PM

As a sequel to my last question, I'm looking to dig deeper into HCI. Specifically, I'd like to learn about natural language processing.

I'm looking for some good introductory reading material on natural language processing. What I've read so far (some math-heavy stuff about Markov chains) has been a bit above my current level of understanding, so I'm turning to the community to recommend the best of the best to get me started. What books or online resources would you, as someone with knowledge in the area, send a person away with to get a solid foundational understanding of current NLP theory?
posted by saraswati to Computers & Internet (6 answers total)
 
Best answer: Two good ones are Manning and Schütze, Foundations of Statistical Natural Language Processing, and Jurafsky and Martin, Speech and Language Processing. There are plenty of others, but these are two that I know are good, and each of them should contain at least some more approachable material. These might be less introductory than you want, though, as each would be usable in a graduate-level course.

There's also an earlier askmefi question that was somewhat broader but perhaps relevant here.
posted by advil at 5:47 PM on April 3, 2007


Best answer: I come from the cognitive psychology side of language processing, where we've used Townsend & Bever's "Sentence Comprehension" book in graduate seminars.
posted by noloveforned at 6:20 PM on April 3, 2007


The most interesting aspect to it is that there is no perfect general solution.

That is proved by the existence of puns. A pun is a word, phrase, or sentence which, in context, can be interpreted in two or more ways which are relevant to the situation. If humans can't even tell which is the correct interpretation, then how could an algorithm decide?

Here's an example, from Burns and Allen:
George: (looking at Gracie, who is arranging a large vase of beautiful flowers) Grace, those are beautiful flowers. Where did they come from?

Gracie: Don't you remember, George? You said that if I went to visit Clara Bagley in the hospital I should be sure to take her flowers. So, when she wasn't looking, I did.
How is a computer to tell whether Gracie's interpretation of George's imperative is the right one?
posted by Steven C. Den Beste at 6:57 PM on April 3, 2007


How is a computer to tell whether Gracie's interpretation of George's imperative is the right one?

A computer couldn't do it perfectly, but it could certainly do it as well as a human given sufficient contextual data, such as: the general propensity for people to steal flowers from people in the hospital, George's likelihood of encouraging larceny, etc.
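
As a rough illustration of that idea, here is a toy sketch in Python that scores the two readings of "take her flowers" by multiplying a prior for each reading by the likelihood of each contextual cue, naive Bayes style. Everything in it is an assumption made up for the example: the reading labels, the cue descriptions, every probability, and the function name rank_readings. It also assumes the cues are independent, which real context rarely is.

```python
# Toy sketch: pick the more plausible reading of "take her flowers"
# by combining hand-assigned (made-up) contextual probabilities.

# Hypothetical prior plausibility of each reading, from general world knowledge.
PRIORS = {
    "bring flowers to her": 0.98,
    "take flowers away from her": 0.02,
}

# Hypothetical likelihood of each contextual cue under each reading.
CONTEXT_LIKELIHOODS = {
    "speaker is suggesting a hospital visit": {
        "bring flowers to her": 0.9,
        "take flowers away from her": 0.1,
    },
    "speaker rarely encourages theft": {
        "bring flowers to her": 0.95,
        "take flowers away from her": 0.05,
    },
}


def rank_readings(priors, context_likelihoods):
    """Return (reading, normalized score) pairs, most plausible first."""
    scores = dict(priors)
    # Multiply each reading's prior by the likelihood of every cue,
    # treating the cues as independent (the naive Bayes assumption).
    for likelihoods in context_likelihoods.values():
        for reading in scores:
            scores[reading] *= likelihoods[reading]
    # Normalize so the scores sum to 1 and can be compared directly.
    total = sum(scores.values())
    normalized = {reading: score / total for reading, score in scores.items()}
    return sorted(normalized.items(), key=lambda item: item[1], reverse=True)


ranked = rank_readings(PRIORS, CONTEXT_LIKELIHOODS)
best_reading, best_score = ranked[0]
print(best_reading, round(best_score, 4))
```

With these invented numbers the "bring" reading wins easily. The hard part in practice is where such numbers would come from, not the arithmetic.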
posted by juv3nal at 9:20 PM on April 3, 2007


That represents an extremely high level of semantic processing, well beyond anything which is possible today. It requires a true artificial intelligence. That's well beyond "natural language processing".
posted by Steven C. Den Beste at 9:43 PM on April 3, 2007


Best answer: Start learning about linguistics. I am in a graduate NLP class right now, and the class is structured by what current NLP technologies can do at different levels of the synchronic model of language.

For our text, we are using the Jurafsky and Martin book that advil mentioned. My professor says that she prefers it because it explains and contextualizes all of the equations, whereas Manning and Schütze's text seems to assume that you know a lot more math.

Incidentally, you can read many chapters of the forthcoming second edition of Jurafsky and Martin's book here.

Even getting a beginner's understanding of NLP can be quite daunting as the field integrates concepts from cognitive psychology, artificial intelligence, linguistics, statistics, computer science, and other related fields.

Good luck! It is a fascinating area of study. If you want a quick-and-dirty intro, here is the entry on Natural Language Processing from the Encyclopedia of Library and Information Science.
posted by rachelpapers at 10:15 AM on April 4, 2007

