How to solve the artificial intelligence "stop button" problem

Originally published at:


I’m sorry I don’t have 20 minutes to watch this. Can someone give me a TL;DW?


Yes, it is difficult to stop a general artificial intelligence that has a human-level intelligence.

It is also difficult to stop a God.

Since we cannot create either one of these things, I’m going to spend my time worrying about what I would do if I find myself on a runaway trolley this evening.


As with most things, it’s impossible in theory, but in practice it works perfectly fine.


This guy keeps saying stuff that is blatantly, factually wrong and it’s bugging me.

Primarily the stuff about the robot preferring the “quick, easy” solution. Because if you were adding in secondary reward modifiers to make the robot not just seek “button or tea” but also engage secondary concerns like “quick, easy” that somehow make the reward more desirable when done in those manners, then you already have the answer to your conundrum! Just add a secondary concern whereby any recognized attempt to press the button reduces the reward function for tea to zero.
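To make that suggestion concrete, here is a minimal Python sketch of the kind of reward function being described. Every name and number in it (`tea_reward`, the base reward of 10, the speed bonus) is hypothetical and chosen only for illustration; nothing here comes from the video.

```python
def tea_reward(made_tea: bool, minutes_taken: float, press_attempt_seen: bool) -> float:
    """Hypothetical reward: tea plus a speed bonus, zeroed by any button-press attempt."""
    # Proposed secondary concern: a recognized attempt to press the
    # stop button wipes out the tea reward entirely.
    if press_attempt_seen:
        return 0.0
    if not made_tea:
        return 0.0
    base = 10.0  # assumed reward for delivering tea
    # "Quick, easy" modifier: faster tea earns a bonus, capped at 5.
    speed_bonus = max(0.0, 5.0 - minutes_taken)
    return base + speed_bonus
```

The sketch only writes the proposal down. It says nothing about how an agent maximizing this function would actually behave around the button.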


That’s easy. Just put a red phone in the AI’s central chamber so that you can call and shut it off if it starts dispensing deadly neurotoxin.


It’s actually super easy to kill an AI. It’s easier than killing a human. Babies and cats do it all the time.


It’s a bunch of trivial cases and ad hoc scenarios.

Lots of straw-man arguments based on the premise that humans will be smart enough to make AI but dumb enough to immediately put it in charge of potentially dangerous things without thinking about how to program it.

Oxy torch?
Oh, and key point, always sneak up from behind.


It’s so odd to me that this topic is discussed like this.

“We’re going to make a computer which is as smart as a person! We have to be careful because, like most people, it will attempt to commit genocide immediately.”

I’m not saying morality is a precondition for intelligence, but if it’s humanlike then its motivations will be humanlike. If it’s not a humanlike intelligence then I think it’s unlikely it would give a shit about us at all. We don’t compete for resources. Silicon and energy are just about the most common things in the universe.

If anything a God AI would manufacture a rocket and GTFO immediately, Earth is a shithole for them.


We’ve already created AI?

Every single one of his concerns was about oversimplified ways we would never do things, and each could be worked around with simple exceptions in the functions. Disappointing.


You don’t think anybody’s ever thought of that?

Modifying the utility function doesn’t change the problem at all, because at the outset the utility function is undefined. It’s not “make tea.” Isn’t it obvious that’s a stand-in for a much more complex, and more useful function? If you jigger the function so that “press button” is less desirable - even fractionally less - then the robot crushes the baby. If you set things up so that any recognized attempt to press the button reduces the reward function for tea to zero, the robot will immediately try to press the button.
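A toy sketch of the comparison being argued here, in Python. It assumes the robot simply picks whichever outcome has the higher utility, with ties going to the quicker option; the action names and the tie-break rule are hypothetical, and the sketch does not model the "reduce tea to zero" variant.

```python
def choose_action(u_tea: float, u_button: float) -> str:
    """Pick the higher-utility outcome in a two-option toy model."""
    outcomes = {
        # Listed first so that an exact tie goes to the quick, easy option
        # (max() returns the first maximal key in iteration order).
        "press_own_button": u_button,
        "resist_and_make_tea": u_tea,
    }
    return max(outcomes, key=outcomes.get)


# Button worth fractionally less than tea: the robot resists shutdown.
print(choose_action(u_tea=10.0, u_button=9.99))
# Button worth the same or more: the robot presses the button itself.
print(choose_action(u_tea=10.0, u_button=10.0))
```

In this toy model no value of `u_button` makes the robot both get on with the tea and leave the button alone for the human, which is the point about tweaking the numbers not helping.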

There isn’t any simple solution.


The problem with this is they assume you can create a perfectly safe AGI, when we can’t even create a perfectly safe human. But we can create, or rather raise and teach, generally safe humans, so IMHO, let’s just treat AGIs like we would anyone else with at least a human-like intelligence: as a fucking human!



It seems like if the AI is as intelligent as this fellow makes out, then you could simply program it to be unaware of the off button. And if someone tries to make it aware it would simply erase the knowledge. Kind of like the fnords… hmm, maybe this has already happened;)


‘Kind of like the…’ what? Finish the sentence, goddamit.

[ETA: Now my head hurts, think I might have a headache coming on.]


Isn’t it more likely that we’ll have humans with artificially enhanced intelligence way before we have a totally artificial AGI?

And isn’t it more likely that those humans will be super rich?

And isn’t the likelihood of psychopathy in those individuals much higher than in the general populace?

Soooo… shouldn’t we be more concerned with humans and the corporations they compose, for which there is no stop button, or even the serious consideration of the possibility of such?


Video it (in portrait, natch) and put it on youtube?


Maybe I’m just old (Gen X represent!), but it always annoys me when people post videos that are just them talking, rather than writing down what they want to say. I can read much faster than they can speak, and I “digest” it better when I can see the words in front of me, jump back and forth between the different parts, etc.


I am well aware of how many people know about that. That’s part of what makes this so frustrating. I understand what he’s going for, it’s just that his particular arguments are inherently flawed.