Natural Language Programming
March 27, 2004 10:44 AM

what's the current state of natural language programming? in particular, how close are we to:
1 - assessing emotional tone
2 - measuring how one fragment of text relates to another semantically
i'm thinking of technological solutions to the recent metatalk thread on political discussion - the examples above might be used to automatically moderate a group (so the second point measures whether you are contributing something that is both pertinent and new - i imagine that might require an additional "knowledge base").
posted by andrew cooke to Technology (8 answers total)
 
Assessing emotional tone. The state of the art here is pretty good, I think. It's quite easy to cross-reference specific words to determine how emotional a post is. You could use a knowledge base for this, but I think the easiest solution would be some kind of bayesian analysis.
Here's an obligatory metafilter link on Eudora and its Chilli rating for emails.
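The bayesian word-scoring idea above can be sketched in a few lines. This is a toy, not a real classifier: the word lists and counts are invented stand-ins for labels a human moderator would have assigned, and a real system would need far more data and proper smoothing.

```python
# Toy naive-Bayes-style "heat" score for a post. All counts below are
# invented for illustration: times each word supposedly appeared in
# posts humans labelled "heated" vs "calm" (totals kept equal so
# unknown words score neutrally).
from math import log

heated_counts = {"idiot": 30, "always": 12, "wrong": 25, "thanks": 2}
calm_counts   = {"idiot": 1,  "always": 8,  "wrong": 10, "thanks": 50}

def heat_score(text, smoothing=1.0):
    """Log-odds that the text is heated: > 0 leans heated, < 0 leans calm."""
    heated_total = sum(heated_counts.values())
    calm_total = sum(calm_counts.values())
    score = 0.0
    for word in text.lower().split():
        p_heated = (heated_counts.get(word, 0) + smoothing) / (heated_total + smoothing)
        p_calm = (calm_counts.get(word, 0) + smoothing) / (calm_total + smoothing)
        score += log(p_heated / p_calm)
    return score

print(heat_score("you are always wrong"))    # positive: leans heated
print(heat_score("thanks for the pointer"))  # negative: leans calm
```

The same machinery behind bayesian spam filters, in other words, just with "heated/calm" labels instead of "spam/ham".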

Measuring Semantic Meaning. In the context of your metatalk thread, I would imagine this is very hard. I could post this, and somebody else could repeat a word (bayesian) or even post a quote of what I said, and suddenly you've got two statements which are semantically similar, and yet, which both add different meaning to the argument. That's my opinion though. Maybe some of the more A.I.-attuned people here will be able to weigh in.
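The quoting problem is easy to demonstrate with the crudest similarity measure going, bag-of-words cosine similarity (the example sentences are made up): a verbatim quote scores a perfect match while contributing nothing new, and a genuinely relevant reply that uses different words scores low.

```python
# Bag-of-words cosine similarity: treat each text as a word-count
# vector and measure the angle between them. Crude, but it shows why
# surface similarity is a poor proxy for "pertinent and new".
from collections import Counter
from math import sqrt

def cosine(a, b):
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[w] * wb[w] for w in wa)
    norm = sqrt(sum(v * v for v in wa.values())) * sqrt(sum(v * v for v in wb.values()))
    return dot / norm if norm else 0.0

original = "bayesian analysis could rate the emotional tone of a post"
quote    = "bayesian analysis could rate the emotional tone of a post"
reply    = "bayesian filtering misses sarcasm and quoted text entirely"

print(cosine(original, quote))  # 1.0: identical, yet adds nothing new
print(cosine(original, reply))  # low, though the reply is on topic
```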

The only thing I've seen recently which claims to measure semantic meaning turned out to be a probable hoax.
posted by seanyboy at 12:11 PM on March 27, 2004


Assessing emotional tone of the words themselves might not be hard. But assessing the intention of the author is "A.I. complete" -- How do you know I'm not joking? How can you tell when I'm quoting others (I have the Tarantino-jazzed version of Ezekiel 25:17 as my sig, say)? How can you tell when I'm being ironic (I mean exactly the opposite of what I say)? Semantic meaning, btw, is just as hard.
posted by zpousman at 2:41 PM on March 27, 2004


Response by poster: thanks folks. i realise that the whole problem is hard, but i was hoping that there might be a "good enough" solution (incidentally, i got a private email suggesting something interesting might happen sometime, somewhere, related in some way to this ;o) (and no, i don't mean that matt's got an ai moderator in beta).

(i'm vaguely of the opinion that "good enough" will eventually win out - as i think dennett argues - but i'm aware that i'm vastly simplifying things).
posted by andrew cooke at 5:24 AM on March 28, 2004


This is something I've been playing around with in my head (as a pure amateur), but I keep coming back to the same problem: even if you could get a SETI@home-type project going where people "taught" language to a computer, the underlying difference between computers and me is that for all the synonyms and meanings for the word "tree," I have an underlying concept of the meaning that's not tied to language. I think being able to create those underlying concepts would be The Great Leap Forward (and I'm sure a zillion people who know what they're doing have already come to this conclusion), but I can't think of how it would work.

Have you seen Morphix-NLP?
posted by yerfatma at 9:26 AM on March 28, 2004


Response by poster: one attempt to solve the problem you're talking about, yerfatma, is cyc - http://www.opencyc.org/

thanks for the pointer to that cd. looks interesting.
posted by andrew cooke at 9:33 AM on March 28, 2004


Interesting. It's unfortunate their approach is to come up with every possible rule that can be imagined and stuff it into a database, just because the processing power now exists. Admitting my inherent bias toward the elegant, I'd think a ruleset for learning would be more valuable/extensible than the sum total of learned human knowledge.

The one part of The Pattern on the Stone that really drew my interest was Hillis' discussion of an evolutionary program he grew (I guess) to sort items in a list. By the tenth generation of evolved code, he had a routine that sorted things quickly. And he doesn't understand the logic inside it. I wonder if such an approach would be valuable here.
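A much cruder toy in the spirit of that experiment: evolve a fixed-length list of compare-swap pairs (a sorting network) by mutation and selection. Everything here is arbitrary -- list length, genome size, population, generation count -- and Hillis' actual setup (co-evolving the test cases against the sorters) was far more sophisticated, but it shows the flavor: you keep whatever sorts best and mutate it, without ever writing the sorting logic yourself.

```python
# Evolve a network of (i, j) compare-swap pairs that sorts short lists.
# Fitness = how many positions end up correct across random test lists.
import random

random.seed(0)
N = 6        # length of the lists to sort
GENES = 30   # compare-swap pairs per individual

def apply_network(net, items):
    items = list(items)
    for i, j in net:
        if items[i] > items[j]:
            items[i], items[j] = items[j], items[i]
    return items

def fitness(net, tests):
    return sum(
        sum(a == b for a, b in zip(apply_network(net, t), sorted(t)))
        for t in tests
    )

def random_gene():
    i, j = sorted(random.sample(range(N), 2))
    return (i, j)

def mutate(net):
    net = list(net)
    net[random.randrange(GENES)] = random_gene()
    return net

tests = [[random.randrange(100) for _ in range(N)] for _ in range(50)]
pop = [[random_gene() for _ in range(GENES)] for _ in range(40)]

for generation in range(200):
    pop.sort(key=lambda n: fitness(n, tests), reverse=True)
    # Keep the ten fittest, refill the population with their mutants.
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(30)]

best = max(pop, key=lambda n: fitness(n, tests))
print(apply_network(best, [5, 3, 9, 1, 7, 0]))  # likely sorted or close to it
```

As with Hillis' sorter, the winning network is just an opaque pile of swaps: it works (or mostly works), but nothing in it explains itself.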
posted by yerfatma at 12:24 PM on March 28, 2004


Moderation is a social activity, and it follows social rules that are still really poorly understood. Even if you could understand the meaning and tone of what people write, you might not be any closer to automating the vast number of judgment calls that make a community work. Right now, moderation is a black art that only seems to work when it's in the hands of talented individuals like Matt and Rusty.

It doesn't take many mistakes in interpersonal relationships to destroy a community, and so I expect even "good enough" automated moderation to be AI-complete. Instead, I would look to see more improvement using approaches that work by closely mixing automated tools with human review, a la Slashdot or Google PageRank or spam filters that you have to continuously train yourself. Many2Many is a good place to follow what's happening in this area (and mostly it shows just how limited our understanding of online social dynamics really is).
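The hybrid approach described above (automation for the easy calls, humans for the judgment calls, human verdicts fed back as training data) can be sketched schematically. Everything here is hypothetical: the function names, the thresholds, and the idea that a score in [0, 1] comes from some separately trained model.

```python
# Hypothetical triage loop: the machine only acts when it is confident,
# queues borderline posts for a human, and records human verdicts as
# labels for retraining -- the spam-filter pattern applied to moderation.
review_queue = []
training_data = []

def triage(post, score, ok_below=0.2, hide_above=0.9):
    """score in [0, 1]: some model's estimate that the post breaks the rules."""
    if score < ok_below:
        return "approve"           # confidently fine: publish
    if score > hide_above:
        return "hold"              # confidently bad: hold for a human to confirm
    review_queue.append(post)      # the judgment calls stay human
    return "queue"

def human_verdict(post, verdict):
    # Every human decision becomes a label for retraining the model.
    training_data.append((post, verdict))

print(triage("thanks for the link", 0.05))    # approve
print(triage("you people are idiots", 0.95))  # hold
print(triage("well, actually...", 0.5))       # queue
```

The point of the design is that the automation never makes the hard calls; it only shrinks the pile a human has to look at.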
posted by fuzz at 5:20 AM on March 29, 2004


Response by poster: thanks for that link, fuzz.
posted by andrew cooke at 12:48 PM on April 11, 2004

