next up previous
Next: Proof Techniques and an Up: Overview / Basic Ideas Previous: Set-up and Formal Goal

Basic Idea of Gödel Machine

Our machine becomes a self-referential [11] Gödel machine by loading it with a particular form of machine-dependent, self-modifying code $p$. The initial code $p(1)$ at time step 1 includes a (typically sub-optimal) problem solving subroutine $e(1)$ for interacting with the environment, such as any traditional reinforcement learning algorithm [20], and a general proof searcher subroutine (Section 5) that systematically makes pairs (switchprog, proof) (variable substrings of $s$) until it finds a proof of a target theorem which essentially states: `the immediate rewrite of p through current program switchprog on the given machine implies higher utility than leaving p as is'. Then it executes switchprog, which may completely rewrite $p$, including the proof searcher. Section 3 will explain details of the necessary initial axiomatic system $\cal A$ encoded in $p(1)$. Compare Fig. 1.

Figure: Storage snapshot of a not yet self-improved example Gödel machine, with the initial software still intact. See text for details.
\begin{figure}\centerline{\psfig{figure=figure1.eps,angle=0,height=12cm}}\end{figure}

The Global Optimality Theorem (Theorem 4.1, Section 4) shows this self-improvement strategy is not greedy: since the utility of `leaving $p$ as is' implicitly evaluates all possible alternative switchprogs which an unmodified $p$ might find later, we obtain a globally optimal self-change--the current switchprog represents the best of all possible relevant self-changes, relative to the given resource limitations and initial proof search strategy.


next up previous
Next: Proof Techniques and an Up: Overview / Basic Ideas Previous: Set-up and Formal Goal
Juergen Schmidhuber 2005-01-03