Death to the "chatbot passed a Turing Test" story


#1



#2

I dressed up a mannequin like a hobo and left it slumped against a wall. Some people across the street mistook it for a real person.

That's right: I built an android that was so lifelike that it could pass for an actual human being. I'll take that Nobel Prize money in the form of cashier's checks, thank you.


#3

Well, it helps that you dressed it up to look like a deaf, mute, and illiterate 13-year-old Ukrainian hobo. That explains to people why he never speaks or writes back to their questions.

As one tech-savvy commenter noted:

The Brainspore bot enjoyed the advantage of simulating someone who can't write or speak, and whose apparent deafness could explain a lack of nuanced reaction, so you could think of this as kind of a cheat, but it's still a very impressive feat.


#4

You'll only get the Nobel Prize if you prove that people donated at least minimum wage for an 8 hour day to it.


#5

“Oh, what a tangled web we weave...when first we practice to deceive.” Walter Scott

Didn't Turing explicitly lay out the terms for how we might test for the ability of a machine to deceive? Isn't that the point, that our ability to blend in with the others is a core aspect of intelligence?

We regularly give partial credit to the creators of lifelike mannequins. Evolution grants partial credit for deceptive camouflage.

Eugene Goostman gamed one implementation of a Turing test, and deserves 1/3 credit.


#6

So, if I make a baby monitor that actually just plays computer-synthesized baby cries and snores, and people think it's a real baby, then I would have "gamed one implementation of a Turing test, and deserve[] 1/3 credit," right?

Changing the rules changes the validity of the test, as brainspore cleverly satirized in the first post.


#7

Yes, I'd say it does deserve credit. The rules of that test would need to allow babies, and this weakens (limits the scope of) the results, just as allowing young second language writers does.

Human baby sounds, their imitation and recognition, are fundamental aspects of our intelligence.


#8

Yeah, no. My baby sound simulator wouldn't be an example of an artificial intelligence that can interactively be mistaken for human sentience. It would be an example of a convincing simulation of baby noises, not sentience. We also recognize cat noises, and by your interpretation: "Feline cat sounds, their imitation and recognition, are fundamental aspects of our intelligence."

The simulation is supposed to be of an artificial intelligence that can pass for human, not a test of our ability to use our intelligence to vet real sounds of life from simulated ones. A CD player on shuffle could convince you of a real cat or baby. That wouldn't be a test of the sentience of a CD player.


#9

When I wake up and look in the mirror, I know I've already failed the Turing Test for the day.


#10


#11

Hehehehehehehhe !! Hararararrararraararar !! Hohohohohohho !! howls of delirious laughter , jorpho !! !! well done !! well done in deed ~


#12


#13

Every time I see Kevin Warwick attached to a story I expect it to be overblown. Needless to say this one wasn't any exception.


#14

As someone who used to write chatbots ten years ago, I was amazed to see how little has changed. This thing is programmed exactly the same way ours were back then - and not too differently from the venerable Eliza program. It just had much branch-ier trees.
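For anyone who hasn't seen one, here is a minimal sketch of the Eliza-style approach: ordered pattern rules plus a canned fallback. The rules, the `reply` function, and the fallback text are all made up for illustration; this is not the code of any actual competition bot, just the general shape of the technique.

```python
import re

# Hypothetical rules for illustration: (pattern, response template) pairs,
# tried in order. Real bots of this kind use far larger, branchier rule trees.
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.I), "Nice to meet you, {0}."),
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bwhere are you from\b", re.I), "Far away. And you?"),
]

# Canned deflection used when no rule matches.
FALLBACK = "That is interesting. Tell me more."


def reply(message: str) -> str:
    """Return the response for the first matching rule, else the fallback."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Strip trailing punctuation so captured text reads naturally
            # when echoed back inside the template.
            groups = [g.rstrip(".!?") for g in match.groups()]
            return template.format(*groups)
    return FALLBACK
```

Anything that misses every pattern gets the same deflection, which is why a persona with a plausible excuse for vague answers helps so much.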

Back then some people were working on bots that "learned" from other people, listening in on real chatrooms and the like. I would have been much happier if that were the state of the art now.

No, this bot doesn't get 1/3 of a credit. The Turing Test was supposed to describe a test that could convince serious people that they were having a serious conversation. Not just "trick" fools into thinking someone has an impediment.

If the judges had been paid good money for everyone they caught, they would have tried harder and obviously caught this one. But the competition only gets publicity if a bot wins, so they have no incentive to do that.


#15

Once upon a time people said we'd know we had artificial intelligence when a machine could beat a person at chess. The real test of artificial intelligence has always been whether a machine can do something that a machine can't do yet.

The Turing Test is pretty silly anyway. What do they say, "You get what you measure"? Of course if there is a fixed test people will try to game it.

I'm not sure what the obsession with human-like intelligence is anyway. If a computer were truly self-aware and intelligent, why would it spend its time trying to convince us it was a human?


#16

The Turing Test is a good test because we know it's an extremely hard problem. That's all it represents -- an extremely hard problem.

Even when people were first building programs that could solve chess, it was well understood that it was going to be a lot easier to solve chess than to solve the Turing Test.

Whether passing the Turing Test indicates anything about Artificial Intelligence is just a matter of definition, to be sure. But the reason that it's considered a fairly good measure is that it's hard to see how any program could beat the Turing Test without "human-like" intelligence.

Sure, you could hard-code in more and more and more responses, like these guys who code these chatbots do, but most researchers are quite convinced that such an approach will never fool a human for very long. A real conversation is just too complicated. In order to have a real conversation, you have to do something that's several orders of magnitude more complicated than what these guys are doing.

Now, you're right that if, in fifty years, someone were able to write a bot made of 50 trillion hard-coded sentences that could trick someone for several hours, we'd say "that's still not Artificial Intelligence." But as far as we know now, such a feat won't be possible.


#17

Did any experts ever say that?


#18

I agree that fooling people into thinking they are talking to other people would be a hard thing for a computer to do, but as long as it is a problem people are trying to solve, the solutions aren't going to represent artificial intelligence. As you said above:

People have been diligently working away to improve upon a model that would eventually, with sufficient computing power, be able to fool humans for long enough in a sophisticated enough way that no one could deny the Turing Test had been passed (even if it's not realistic that we are going to do this). But by doing so they aren't making a single step towards anything we'd think of as an intelligence. They are just trying to beat a game that Turing made up.

An AI that fools people into thinking that it's a person because it was told that if it managed to fool people it would get the rest of the day off to play Elder Scrolls XXI is a real human-like AI. An AI that programs chat bots to beat the Turing Test in its spare time would be a real human-like AI. That's the kind of thing where we would have to say, "Wow, yeah, that's actually an artificial intelligence."

I didn't mean to say that the Turing Test isn't an interesting challenge, it's just one more thing that, if passed, we could say, "Yeah, but that's not really an AI."


#19

I think that I wasn't being clear about quite how hard this is.

Apart from trivial cases, there have probably never been two identical five-minute-long conversations in all of human history. So even if you had a program with infinite storage capacity and fed it every single conversation from history, so that, like looking up moves in a chess database, it had the mot juste for any opening line, it would soon be exposed by a dedicated questioner.
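To make the lookup idea concrete, here is a toy sketch, assuming stored conversations are plain lists of lines. The names `build_table` and `next_line` are hypothetical. The table maps each exact conversation prefix to the line that followed it, the way an opening book maps positions to moves.

```python
def build_table(transcripts):
    """Map every exact conversation prefix to the line that followed it."""
    table = {}
    for lines in transcripts:
        for i in range(1, len(lines)):
            # Keep the first continuation seen for a given prefix.
            table.setdefault(tuple(lines[:i]), lines[i])
    return table


def next_line(table, history):
    """Return the stored continuation, or None if this exact history is new."""
    return table.get(tuple(history))


# Usage with a single made-up transcript:
table = build_table([["Hi", "Hello", "How are you?", "Fine"]])
known = next_line(table, ["Hi", "Hello", "How are you?"])
unknown = next_line(table, ["Hi", "Hello", "What's up?"])
```

The moment the questioner steps off any recorded path, the lookup comes back empty, and the number of possible paths grows with every line.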

The point is that there are two possible paths towards creating a program that could fool humans for extended periods of questioning: (1) we write some variant of OP's bot, but with trillions upon trillions more sentences and logic, or (2) we create a machine that genuinely seems to think.

It's really hard to know which is harder. We have no evidence that the second is even possible, true, but we're pretty sure the first is impossibly hard.

If the first gets created, then you're right: we'll turn and say "well that was a stupid test, that's not intelligence." But that path may not actually be possible at all. That's why it seems to be a pretty good test -- because as far as we know now, it's not possible to beat it without genuinely being intelligent.


#20

Does this unit have a soul?