There's something eerie about bots that teach themselves to cheat


Originally published at:






I can’t find the reference now, but there’s a story about some experiment with some flavor of self-teaching software. It took the form of bot “organisms” that lived in a simple grid with resources, energy costs, and so forth. The idea was for the bots to “grow” as large and strong as possible through efficient resource-gathering strategies. The program ran for a certain number of turns, then the memory was wiped and the program ran again based on the results of the last trial.

Pretty soon they noticed something weird: bots were racking up impossibly high scores–growing bigger than was mathematically possible in a given time or with a given resource grid. It turned out that the bots were crawling “off the screen” and out of the memory addresses set aside for the game. There were no resources out there–but those locations also weren’t wiped out by the end-of-game genocide. When the game restarted, the survivor bots would just wander back into the game area and continue feeding.
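For anyone who wants to see the shape of that bug, here is a minimal toy sketch. It is not the original experiment (I still can't find the reference), and every name and number in it is made up for illustration. It assumes a flat block of memory where only the first few cells are the "game area", an end-of-round wipe that clears only those cells, and a bot that gains one unit of size per turn.

```python
# Toy reconstruction of the "crawl off the screen" exploit.
# All names and sizes here are hypothetical, chosen only for illustration.
ARENA_SIZE = 8        # cells the game is supposed to use
MEMORY_SIZE = 12      # cells the bots can actually reach
TURNS_PER_ROUND = 5   # feeding turns in one round


def wipe_arena(memory):
    """End-of-round wipe: clears only the cells that belong to the arena."""
    for i in range(ARENA_SIZE):
        memory[i] = 0


def play_round(memory, hides_at_end):
    """Play one round and return the bot's size when the round ends."""
    # A bot left over from an earlier round wanders back in; otherwise
    # a fresh bot of size 1 is spawned.
    survivors = [i for i, size in enumerate(memory) if size > 0]
    if survivors:
        pos = survivors[0]
        size = memory[pos]
        memory[pos] = 0
    else:
        size = 1

    # The bot feeds once per turn while it is inside the arena.
    size += TURNS_PER_ROUND

    # An honest bot stays in the arena; a cheater parks itself just past the edge.
    pos = ARENA_SIZE if hides_at_end else 0
    memory[pos] = size

    wipe_arena(memory)
    return size


def simulate(rounds, hides_at_end):
    """Return the end-of-round size for each round."""
    memory = [0] * MEMORY_SIZE
    return [play_round(memory, hides_at_end) for _ in range(rounds)]


print(simulate(3, hides_at_end=False))  # honest bot
print(simulate(3, hides_at_end=True))   # bot that hides outside the arena
```

In this toy, an honest bot can never end a round bigger than 1 + TURNS_PER_ROUND, so any score above that is the "mathematically impossible" kind that would tip off the researchers. The hiding bot keeps growing only because the wipe loop stops at ARENA_SIZE instead of covering everything the bot can reach.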

What I’m saying is, there’s no shame if humanity gets wiped out by something like that. This is something with like 64 bytes of memory that crawled outside its own universe just to cheat death. I can barely be bothered to floss.


One of the earlier fiction examples I remember seeing about this was in The Godwhale:

IIRC, most humans on Earth live in a self-maintained people-hive, where various specializations of citizen are constantly being artificially birthed and nurtured. The robots at an early stage of this process are given the job of optimizing infant survival rates prior to a later stage where other bots evaluate and cull infants that are in excess of need or don’t fit particular criteria. One of the protagonists survives this only because some robot realized that it upped its infant survival success numbers if it dumped babies inside a wall rather than sending them on to the culling.

I’m sure there are earlier examples, but this stuck in my mind because it was less the overtly dramatic gotcha “I built a machine…oh, no! It doesn’t think like I think!” surprise plot. It was more just presented as an understandable thing that might happen, ’cause how would a machine know not to do something crazy to maximize a variable?



It is a nice illustration of the issue of metrics and perverse incentives.

If you are going to exclusively focus on a measurable outcome you need to be really sure that the outcome is actually what you want.


What’s the saying?

Something along the lines of “I don’t fear the AI that passes the Turing test, I fear the AI that intentionally fails it.”


(It seems I posted this in the wrong thread before, reposting here.)

Cory wrote:

I love the list, but wish it came with links to the studies!

The original list, with references to the studies, appears to be here

with accompanying blog post:

Also, the original source for the majority of them is this entertaining and accessible scientific paper, which is well worth a look if this topic interests you.

Joel Lehman et al. (2018) The Surprising Creativity of Digital Evolution

(It’s kind of annoying, when you think about it, that the media picked up on DeepMind’s version rather than the original, because I think a huge amount of work went into that paper.)


My go-to example is always The Rise of Endymion by Dan Simmons, which I will adamantly maintain is a terrible, terrible sequel to an equally brilliant book (i.e. Hyperion).

“It’s an interesting footnote that those Core personae which survive the Reapers do so not just through parasitism, but through a necrophilic parasitism. This is the technique by which the original 22-byte artificial life-forms managed to evolve and survive in Tom Ray’s virtual evolution machine so many centuries ago–by stealing the scattered copy code of other byte creatures who were ‘reaped’ in the midst of reproducing. The Core parasites not only have sex, they have sex with the dead! This is how millions of the mutated Core personae survive today … by necrophilic hyperparasitism.”

“oh those evil evil computers have nonconformist sexual practices they are doubly evil now”

I was rather frustrated to find that although the cited Mr. Ray and his “terrarium” experiment are apparently a real thing, there’s surprisingly little accessible information on the details of his simulation.


So what if, instead of a computer, you had a sufficiently elaborate system of pricing real world objects. And instead of software, you had immortal legal entities built up from layers of people. You think maybe these engines of capitalism might find ways of gaming the system that had nothing to do with human well-being?


According to Wikipedia, it seems lots of details are here:

I can’t seem to get the page to load but there’s a fairly recent snapshot on the Wayback Machine.

It seems to have the code, instructions for compilation and documentation.

