The rush to define a better Turing test seems pretty misguided given that we all agree "Eugene" in fact fails any reasonable interpretation of it, quite badly, and that the real problem here is the credulousness, or even outright lying, of the people who ran the test.
A bot that could persistently convince a person of average intelligence that it was human over an extended period of time would still be quite the thing. A bot that could convince me it was human would require someone to leap some pretty significant AI barriers that so far nobody has even come close to. It may not be the touchstone of "true AI", because that's a very slippery term, but it's a legitimate milestone, and the fact that it's 2014 and "Eugene" is still the best we have (give or take a bit) is evidence for the idea that it's a hard test, not evidence against!
It doesn't matter what test we use... credulous or deceptive people are still going to prematurely cry "success!". Twiddling with the test isn't solving the real problem here. (Again, as always, step one in solving the problem is identifying the problem. It's often harder than it seems at first....)
Was the Turing Test ever about measuring intelligence? I always thought that it was about thought, i.e. the existence of consciousness.
The test was in essence a call to empathy: a reminder that, if an entity were complex enough to exhibit behavior we couldn't distinguish from that of an educated fellow human, it would be impolite to treat it as "inhuman" or "a thing".
So why use it to measure intelligence, when other tests like I.Q. are designed specifically for that?
Well, 'imitating human conversation' is what the published versions of the Test seem to be about.
Unfortunately, it seems the current state of the art is devoted only to imitating the superficial parts of it. I'm not aware of anyone trying to build a SHRDLU-like model for conversation that tries to understand and remember what is being said, rather than merely reacting to it.
IBM's Watson and the OpenCYC project are doing that.
a Cycorp engineer informed the system they would be discussing anthrax. Cyc responded: "Do you mean Anthrax (the heavy metal band), anthrax (the bacterium), or anthrax (the disease)?" Asked to comment on the bacterium's toxicity to people, it replied: "I assume you mean people (homo sapiens). The following would not make sense: People Magazine."
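For concreteness, here is a toy sketch of the kind of disambiguation described in that exchange: when a term maps to several concepts in a knowledge base, ask which is meant; when a type constraint rules a sense out, say so. The tiny SENSES/ORGANISMS tables and the disambiguate() helper are made up for illustration; they are not Cyc's actual ontology or API.

```python
# Made-up mini knowledge base: each surface term maps to candidate senses.
SENSES = {
    "anthrax": ["Anthrax (the heavy metal band)",
                "anthrax (the bacterium)",
                "anthrax (the disease)"],
    "people": ["people (Homo sapiens)", "People Magazine"],
}

# Senses that denote organisms (relevant when a query needs a living thing).
ORGANISMS = {"people (Homo sapiens)", "anthrax (the bacterium)"}

def disambiguate(term: str, must_be_organism: bool = False) -> str:
    options = SENSES.get(term.lower(), [term])
    ruled_out = []
    if must_be_organism:
        ruled_out = [o for o in options if o not in ORGANISMS]
        options = [o for o in options if o in ORGANISMS]
    if len(options) == 1:
        reply = f"I assume you mean {options[0]}."
        if ruled_out:
            reply += " The following would not make sense: " + ", ".join(ruled_out) + "."
        return reply
    # Still ambiguous: ask the user which sense they intend.
    return "Do you mean " + ", ".join(options[:-1]) + ", or " + options[-1] + "?"

print(disambiguate("anthrax"))                        # asks which of the three senses
print(disambiguate("people", must_be_organism=True))  # settles on Homo sapiens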
Here [1] is a list of some Winograd schemata devised for testing artificial intelligence chatterbots. It boggles my mind how diverse an AI's knowledge of the world, and of the relationships within it, would have to be to resolve these, especially when you think about how natural that process is to us, to the point that we don't even recognize the ambiguity.
Compared to other tests, it would also be easier to work these into everyday conversation; that way, you don't have to know you're judging a Turing test in order to catch one of them.
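For concreteness, here is a minimal sketch of what a Winograd-schema test harness could look like. The two entries are the classic trophy/suitcase pair from Levesque's examples; the resolve() function is a hypothetical stand-in for whatever system is being evaluated (here just a naive baseline), not anyone's actual resolver.

```python
# Each schema: a sentence, the ambiguous pronoun, the candidate antecedents,
# and the answer a human gives without noticing any ambiguity at all.
SCHEMAS = [
    {
        "sentence": "The trophy doesn't fit in the brown suitcase because it is too big.",
        "pronoun": "it",
        "candidates": ["the trophy", "the suitcase"],
        "answer": "the trophy",
    },
    {
        "sentence": "The trophy doesn't fit in the brown suitcase because it is too small.",
        "pronoun": "it",
        "candidates": ["the trophy", "the suitcase"],
        "answer": "the suitcase",
    },
]

def resolve(sentence: str, pronoun: str, candidates: list[str]) -> str:
    """Placeholder resolver: a real system would need world knowledge
    (relative sizes, containment) to pick the right antecedent."""
    return candidates[0]  # naive baseline: always pick the first mention

def score(schemas) -> float:
    correct = sum(
        resolve(s["sentence"], s["pronoun"], s["candidates"]) == s["answer"]
        for s in schemas
    )
    return correct / len(schemas)

if __name__ == "__main__":
    print(f"accuracy: {score(SCHEMAS):.0%}")  # the baseline gets only one of the pair
```

The point of pairing each sentence with its one-word variant is that surface statistics can't help: only an actual grasp of why big things don't fit in small containers separates the two answers.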
The article mentions language processing, interpreting audio and visual material, and playing games as goals that have been set for AI, and then suggests, as an alternative, goals that "can expand with our own abilities and desires."
But there is little mention of creativity, empathy or emotion - aren't these the types of dynamic, responsive cognitive abilities that are uniquely human? Certainly there's an ethical dimension to the question of whether we should strive to recreate these in AI, but they seem like the kind of goals that depend on complex and dynamic intelligence, and would be the most difficult types of tests for computers to pass.
>But there is little mention of creativity, empathy or emotion - aren't these the types of dynamic, responsive cognitive abilities that are uniquely human?
I wouldn't say "uniquely human", other animals show signs of all of those.
Still, they're not really represented in machines. I don't believe that machines are incapable of them, or that people are trying to avoid implementing them; it's more that the basics still need improving before we need to worry about it.
On the other hand, it's possible that people are working on AI from the wrong angle, and creativity, empathy, and emotion are needed at a much lower level and would make imparting knowledge to machines easier.
I think this article badly misunderstands the whole point behind Turing's test: that we don't have a definition of intelligence. We just have a model: us. Lacking a definition, coming up with measures is foolhardy, and yet here we are with an article attempting exactly that.
> Machine learning researcher Hector Levesque of the University of Toronto proposes that resolving such ambiguous sentences, called Winograd schema, is a behavior worthy of the name intelligence.
> Humans are also exceptionally good at recognizing faces.
> We could further ask for the computer to interpret audio-visual phenomena and then reason about them.
These qualities are neither sufficient nor necessary conditions for intelligence. Would Helen Keller pass the last two? Computers are getting very good, very quickly, at facial recognition. Once a computer beats people at this task, would it be intelligent?
What Turing was getting at was to devise a test which didn't measure anything: it merely exploited the fact that we just have a model. Until we face that, such "revisions" are absurd.
As a budding AI researcher, I don't believe there is any specific test that can measure AI.
Let me explain. We humans are smart; we have common-sense reasoning. This is what computers really lack, and it's what puts them at a disadvantage when communicating with us as a fellow human would.
We can recognize this, and I believe the best AI test is simply to throw the system out there where a lot of people can use it; when they are done, if they agree that it's smart, it's smart and we can call it AI. There's no rule and no trick such as trying to fool someone. It could be a casual chat, a serious chat about topics both parties learned beforehand, or a game of some sort.
You know, I think a machine capable of recognizing puns would be very close to some measure of real intelligence.
This thought was prompted by some friends talking about some animals. "There was a herd of them?" "Yea, I heard them. Ayyy!" A machine that recognizes puns must be able to detect auditory similarity while also recognizing that the words are distinct and that their meanings are related in some loosely defined way.
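As a rough illustration of the "sounds alike, but distinct words" half of that, here is a toy sketch that uses Soundex as a crude proxy for auditory similarity. A real pun detector would also have to judge whether the two meanings are plausibly related in context, which is the genuinely hard part; the candidate_pun() helper below is purely illustrative.

```python
def soundex(word: str) -> str:
    """Very crude Soundex: keep the first letter, map remaining consonants
    to digit classes, skip vowels/h/w/y and repeated codes, pad to 4 chars."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    digits = [codes.get(c, "") for c in word]
    out = [word[0].upper()]
    prev = digits[0]
    for d in digits[1:]:
        if d and d != prev:  # drop vowel-like letters and adjacent duplicates
            out.append(d)
        prev = d
    return "".join(out).ljust(4, "0")[:4]

def candidate_pun(word_a: str, word_b: str) -> bool:
    """Distinct words that share a phonetic code are pun candidates;
    whether the meanings actually connect is left to the (missing) hard part."""
    return word_a.lower() != word_b.lower() and soundex(word_a) == soundex(word_b)

print(candidate_pun("herd", "heard"))  # True  -- the example from the thread
print(candidate_pun("herd", "flock"))  # False -- related meaning, no sound match
```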
Has no one read "Do Androids Dream of Electric Sheep"? I think this whole conversation would be greatly enlightened if we started thinking in the opposite direction: once AI is that advanced, how can we tell human and machine "behavior" apart?
Putting it another way: is Philip K. Dick's Voigt-Kampff test the inverse of the Turing test?
There are lots of ways to be intelligent without being very humanish. Unless someone is really trying to imitate human intelligence specifically, such a test would probably be really easy.