Laurent Orseau, Mark Ring. Self-Modification and Mortality in Artificial Agents. AGI, 2011.

Abstract

This paper considers the consequences of endowing an intelligent agent with the ability to modify its own code. The intelligent agent is patterned closely after AIXI (Hutter, 2004), but the environment has read-only access to the agent's description. On the basis of some simple modifications to the utility and horizon functions, we are able to discuss and compare some very different kinds of agents, specifically: reinforcement-learning, goal-seeking, predictive, and knowledge-seeking agents. In particular, we introduce what we call the "Simpleton Gambit", which allows us to discuss whether these agents would choose to modify themselves to their own detriment.