How to solve the artificial intelligence "stop button" problem

This guy keeps saying stuff that is blatantly, factually wrong and it’s bugging me.

My main objection is to the claim that the robot will prefer the “quick, easy” solution. If you are adding secondary reward modifiers so that the robot doesn't just seek “button or tea” but also weighs secondary concerns like “quick, easy” that make the reward more desirable when the task is done that way, then you already have the answer to your conundrum! Just add another secondary concern: any recognized attempt to press the button reduces the reward for tea to zero.
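
To make that concrete, here is a rough sketch of the kind of reward function I mean. Everything in it is made up for illustration (the `Episode` fields, the constants, the 10-minute time limit), and it simply assumes that some detector can set `button_attempt_recognized` reliably. It only models the tea reward, with a "quick" bonus as the existing secondary modifier and the button check as the new one.

```python
from dataclasses import dataclass

# All names and numbers below are hypothetical, chosen only for illustration.
BASE_TEA_REWARD = 100.0      # reward for making tea at all
MAX_SPEED_BONUS = 20.0       # extra reward for doing it "quick, easy"
TIME_LIMIT_SECONDS = 600.0   # after this long, the speed bonus is gone


@dataclass
class Episode:
    made_tea: bool
    seconds_taken: float
    # Assumed to come from some detector that flags the robot going for the button.
    button_attempt_recognized: bool


def tea_reward(ep: Episode) -> float:
    """Tea reward with two secondary modifiers: a speed bonus and a button veto."""
    if not ep.made_tea:
        return 0.0
    # The proposed secondary concern: a recognized button attempt zeroes the tea reward.
    if ep.button_attempt_recognized:
        return 0.0
    # The existing "quick, easy" modifier: bonus shrinks linearly with time taken.
    fraction_of_time_left = max(0.0, 1.0 - ep.seconds_taken / TIME_LIMIT_SECONDS)
    return BASE_TEA_REWARD + MAX_SPEED_BONUS * fraction_of_time_left
```

With these numbers, tea made in five minutes with no button attempt scores 110, while the same run with a recognized button attempt scores 0.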
