A computer program that could learn... anything?
February 23, 2014 1:44 AM

Has anyone ever tried to make a generalized learning program, such that it could be fed any kind of data, and would correlate it all, and eventually be able to form new conclusions or answer complex questions relating to the data?

I'm sure smart people have already thought about this, but I don't know exactly what you'd call this concept, so I'm unable to search for it. Let me give an example of the concept:

You have this computer program. All it "knows" to begin with are simple rules of logic and how to parse simple sentences. You start feeding it data. You feed it sentences like

* Apples are fruit.

* Breakfast cereals often contain a lot of sugar.

* Fresh vegetables are good for your health.

* Jenny is a vegan.

And so on and so forth. Anything related to food. When you give it these sentences, it doesn't know what most of the words refer to, but it can remember the connections. It doesn't know what Jenny is or what a vegan is, but it knows that Jenny is a vegan. Later you tell it that Tom is a vegan. You also tell it that vegans don't eat meat. Though it still doesn't know what any of these things are, it now knows that Jenny and Tom both don't eat meat.
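The Jenny-and-Tom step above can be sketched in a few lines of Python. This is only a rough illustration, assuming the sentences have already been parsed into (subject, relation, object) triples; the relation names `is_a` and `does_not_eat` and the `infer` helper are made up for the example.

```python
# Facts as (subject, relation, object) triples. The program attaches no
# meaning to the words; it only stores and combines the connections.
facts = {
    ("Jenny", "is_a", "vegan"),
    ("Tom", "is_a", "vegan"),
    ("vegan", "does_not_eat", "meat"),
}

def infer(facts):
    """Repeat one rule until nothing new appears:
    if X is_a C and C <relation> Y, then X <relation> Y."""
    known = set(facts)
    while True:
        new = {
            (x, rel, y)
            for (x, r1, c) in known if r1 == "is_a"
            for (c2, rel, y) in known if c2 == c and rel != "is_a"
        } - known
        if not new:
            return known
        known |= new

derived = infer(facts)
print(("Jenny", "does_not_eat", "meat") in derived)  # True
print(("Tom", "does_not_eat", "meat") in derived)    # True
```

The hard part the sketch skips is getting from English sentences to clean triples in the first place.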

As you continue to feed it information, it continues to draw as many connections as possible. After a while, it not only knows that Jenny does not eat meat, but it knows that she won't eat most cakes (even though it was never given that sentence). When it sees a conflict, or an opportunity to make as many connections as possible, it asks clarifying questions. Questions like, "Are all humans vegan?" or "Can fruit be eaten raw?" or "Can milk be fried?" or "Do shrimp taste better when they are bitter?" or "What is the difference between a sandwich and a calzone?" Every question can be answered with a simple answer, or with an answer which introduces a new concept, or with something fuzzy like "Most of the time" or "I don't know" or "I don't want to answer that right now".

Eventually, after receiving enough input (maybe from many, many users), the program starts asking more subtle questions. Eventually, you can even ask it your own questions. You can say to the program, "I'm making a soup with ingredients X, Y, and Z. How should I spice it?" Or you can say, "What side dish would go well with steak?" Given enough data, and enough correlation, the program should be able to provide you with useful answers.

The program need not be used only for food. You could also enter information about clothing, or cars, or weapons, or even relationships.

This concept makes sense in my mind, but I have a hard time determining whether it's feasible, even theoretically. In a sense, it's basically the question of whether we could program something that functioned largely like a human brain. What makes me think it might be possible is the fact that it seems like the initial program wouldn't actually have to be that complicated; all it would really need, like I said above, is the ability to use logic and understand simple sentences. Well, and probably a pretty hefty processor.
posted by CustooFintel to Technology (17 answers total) 3 users marked this as a favorite
Have you ever seen Wolfram Alpha?
posted by ClaireBear at 2:00 AM on February 23, 2014


Douglas Lenat did something like this, first with a program called AM (automated mathematician), limited to the domain of mathematics. You set it up with a few axioms and it starts asking its own questions, trying to discover new theorems. Later he generalized it to other domains with Eurisko and Cyc, which is probably closest to what you're thinking of.
posted by zanni at 2:01 AM on February 23, 2014


Cleverbot is not exactly like this, but not completely different either. More information.
posted by A Thousand Baited Hooks at 2:09 AM on February 23, 2014


Here is some AI terminology for problems like this:

Commonsense knowledge
Expert system
posted by equalpants at 2:45 AM on February 23, 2014


This was tried a while back (unfortunately I can't remember the name of the project, arrgh - maybe it was Cleverbot as mentioned above). It looked very similar to what you're suggesting. They asked for general input from People On The Internet to feed it information. People On The Internet started feeding it wrong/funny answers because that was amusing. They had to restart/switch to another model for feeding it information. Ring any bells with anyone else?
posted by gnimmel at 3:25 AM on February 23, 2014


The name for this setup is an "expert system." They are very useful but don't bear much relationship to how the brain works -- we don't have an enormous memorized ruleset in our heads. An expert system isn't AI in any meaningful sense; it's more like a highly cross-referenced encyclopedia than like a brain.
posted by ook at 4:23 AM on February 23, 2014


I saw a really cool talk about Watson, the Jeopardy-playing IBM project, being used for medical diagnosis. It's a natural language recognition program and it's supposed to make exactly the kind of indirect links you're talking about.
posted by tchemgrrl at 4:35 AM on February 23, 2014


"...is the ability to use logic and understand simple sentences"
These are actually very difficult for computers to do. They are such difficult projects, in fact, that it's easier for programmers to guess what a person is likely to ask, or to store up a database of what many people have previously asked, and then serve up the answers that they liked. This is what Google is doing when you type in "why does" and it has several suggestions to complete your question. It doesn't actually understand what you're asking. A useful analogy might be to think about robots assembling car parts. They're definitely doing all the things that used to be done by humans, but they don't actually understand anything or have any cognition whatsoever going on.
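To make the "database of what people have previously asked" idea concrete, here is a minimal Python sketch. It is not how any real search engine works; the `past_queries` log and the `suggest` function are invented for illustration, and the only technique shown is counting stored queries that share a prefix.

```python
from collections import Counter

# Hypothetical log of questions people have typed before.
past_queries = [
    "why does my cat stare at me",
    "why does the sky look blue",
    "why does my cat stare at me",
    "why is the sky blue",
    "why does bread rise",
]

def suggest(prefix, log, limit=3):
    """Return the most frequent past queries that start with the prefix."""
    counts = Counter(q for q in log if q.startswith(prefix))
    return [q for q, _ in counts.most_common(limit)]

print(suggest("why does", past_queries))
```

Nothing here parses or understands the question; it is pure string matching and counting.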
posted by kavasa at 5:05 AM on February 23, 2014


Gnimmel, you might be thinking of these guys? In any case that's what this question made me think of, and the "comparison to other projects" section of that link gives a number of other, similar endeavors. Of especial interest may be Freebase, which was acquired by Google a few years back.

In short, you aren't the first person to come up with this approach, and it is waaaay harder than it sounds.
posted by town of cats at 5:10 AM on February 23, 2014


Yeah, this is exactly what IBM's Watson is for. People think of it as a Jeopardy-playing computer, but Jeopardy was just a way to test it, and was later used for PR purposes.

Watson doesn't ask questions, though. It forms its own "conclusions" based on information fed into it, but it can't ask a user for clarification.
posted by Sara C. at 7:13 AM on February 23, 2014


Natural languages are extraordinarily difficult to handle algorithmically. The classic example of this is the sentence:
Love flies like a breeze, fruit flies like a banana.
Every language has puns and idioms, and those represent nearly insurmountable obstacles for the kind of computer program you're talking about.
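A small Python sketch shows how quickly the ambiguity piles up even before puns and idioms enter the picture. The toy `lexicon` below is an assumption made for this example, not a real linguistic resource; it lists the parts of speech each word could take and enumerates every combination.

```python
from itertools import product

# Toy lexicon (an assumption for this example): each word maps to the
# parts of speech it could plausibly take.
lexicon = {
    "fruit": ["noun", "adjective"],
    "flies": ["noun", "verb"],
    "like": ["verb", "preposition"],
    "a": ["determiner"],
    "banana": ["noun"],
}

def tag_sequences(sentence):
    """List every combination of part-of-speech tags for the sentence."""
    words = sentence.lower().split()
    return list(product(*(lexicon[w] for w in words)))

readings = tag_sequences("fruit flies like a banana")
print(len(readings))  # 8
```

Most of those eight combinations are not grammatical, but a program has to rule them out somehow, and at least two of them ("fruit flies" as insects that enjoy a banana, or fruit that flies the way a banana does) survive.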
posted by Chocolate Pickle at 7:38 AM on February 23, 2014


There is a branch of computer science that studies "Ontologies", which are ways of formally documenting information and the relationships between different data points, in such a way as to let computers understand that information.

One of the big places this is used is in research on the Semantic Web. The Semantic Web is the study of how web pages could be created such that computers could understand both the content of the pages and the connections between linked pages.
posted by nalyd at 7:50 AM on February 23, 2014


You might be interested in exploring logic programming, and Prolog in particular. It's not a magic way of handling natural language or asking questions, but it's a language where you put in facts and it finds relationships based on those facts without having to 'understand' what they mean to us.
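For a feel of the "put in facts, ask questions" style without installing Prolog, here is a rough Python imitation. It only matches a single pattern against stored facts; real Prolog also chains rules together and backtracks, which this sketch does not attempt. The facts and the `query` function are invented for the example.

```python
# Facts as tuples, loosely in the spirit of Prolog facts such as
# is_a(apple, fruit). All names here are invented for the example.
facts = [
    ("is_a", "apple", "fruit"),
    ("is_a", "banana", "fruit"),
    ("is_a", "carrot", "vegetable"),
    ("eats", "jenny", "apple"),
]

def query(pattern, facts):
    """Match a pattern against the facts. Strings starting with '?' are
    variables; each match is returned as a dict of variable bindings."""
    results = []
    for fact in facts:
        if len(fact) != len(pattern):
            continue
        bindings = {}
        for p, f in zip(pattern, fact):
            if p.startswith("?"):
                # A variable must bind to the same value everywhere it appears.
                if bindings.setdefault(p, f) != f:
                    break
            elif p != f:
                break
        else:
            results.append(bindings)
    return results

print(query(("is_a", "?x", "fruit"), facts))
# [{'?x': 'apple'}, {'?x': 'banana'}]
```

The program never needs to know what an apple is; it only needs the pattern to line up with a stored fact.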
posted by 168 at 8:08 AM on February 23, 2014


I was going to mention Prolog and Expert Systems as well. While this seems straightforward, it is incredibly hard in practice for lots of situations.
posted by mmascolino at 10:57 AM on February 23, 2014


This is much easier if you remove the requirement to parse natural language. In that case, tools like RDF can be used to build databases of facts from which new information can be inferred.
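As a hedged sketch of what "inferring new information from a database of facts" can look like, the Python below uses plain tuples standing in for RDF triples. No RDF library is involved, the data is made up, and the two rules are only modeled on the subclass rules from RDF Schema.

```python
# Plain Python tuples standing in for RDF triples (subject, predicate,
# object). The predicate names mimic rdf:type and rdfs:subClassOf, but no
# RDF library is used and the data is made up.
triples = {
    ("GrannySmith", "type", "Apple"),
    ("Apple", "subClassOf", "Fruit"),
    ("Fruit", "subClassOf", "Food"),
}

def closure(triples):
    """Apply two RDFS-style rules until no new triples appear:
    1. A subClassOf B and B subClassOf C  =>  A subClassOf C
    2. x type A and A subClassOf B        =>  x type B"""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(known):
            if p not in ("subClassOf", "type"):
                continue
            for (s2, p2, o2) in list(known):
                if p2 == "subClassOf" and s2 == o:
                    new = (s, p, o2)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known

inferred = closure(triples) - triples
print(sorted(inferred))
```

Three stored facts yield three new ones, including that GrannySmith is a Food, which was never entered directly.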
posted by russm at 2:22 PM on February 23, 2014


Here's some more detail on Watson. It really is a generic learning system that will generate answers based on whatever type of corpus you feed it. They even have ideas for you on how you can build your own.

Full Disclosure: I work for IBM, but thus far have had nothing to do with Watson, except avid reading.
posted by Roger Dodger at 3:42 PM on February 23, 2014


There's a company working on that. There's lots of discussion in the research world about why it has not worked for anyone yet. It turns out it's just really hard, and some folks don't think it'll ever be possible. Research on the brain keeps turning up more amazing deep complexities, so Google might just get it working, or it may be a much harder problem; no one actually knows.
posted by sammyo at 5:58 PM on February 23, 2014

