Next: Bias-Optimal Proof Search (BIOPS) Up: How Online Proof Techniques Previous: Instructions/Subroutines for Making &

### Global Optimality Theorem: Self-Improvement Strategy is not Greedy

Intuitively, at any given time the Gödel machine should execute some self-modification algorithm only if it is the `best' of all possible self-modifications, given the optimality criterion, which typically depends on available resources, such as storage size and remaining lifetime. At first glance, however, target theorem (2) seems to implicitly talk about just one single modification algorithm, namely, switchprog as set by the systematic proof searcher at time . Isn't this type of local search greedy? Couldn't it lead to a local optimum instead of a global one? No, it cannot, according to the global optimality theorem:

Theorem 2.1 (Global Optimality of Self-Changes)   Given any formalizable utility function (Item 1f), and assuming consistency of the underlying formal system, any Gödel machine self-change obtained through execution of some program switchprog identified through the proof of a target theorem (2) is globally optimal in the following sense: the utility of starting the execution of the present switchprog is higher than the utility of waiting for the proof searcher to produce an alternative switchprog later.

Proof. The target theorem (2) implicitly talks about all the other switchprogs that the proof searcher could produce in the future. To see this, consider the two alternatives of the binary decision: (1) either execute the current switchprog, or (2) keep searching for proofs and switchprogs until the systematic searcher comes up with an even better switchprog. Obviously the second alternative concerns all (possibly infinitely many) potential switchprogs to be considered later. That is, if the current switchprog were not the `best', then the proof searcher would not be able to prove that setting switchbit and executing switchprog will cause higher expected reward than discarding switchprog, assuming axiomatic consistency.

That is, only if it is provable that the current proof searcher cannot be expected to find sufficiently quickly an even better switchprog, will it also be provable that the current switchprog should be executed. So the trick is to let the formalized improvement criterion take into account the (expected) performance of the original search method, which automatically enforces globally optimal self-changes, relative to the provability restrictions.

Will the Gödel machine ever prove a target theorem? This obviously depends on the nature of environment and utility function. See Section 3.1 for intuitive examples of very simple, limited self-rewrites (affecting only near-future rewards) whose benefits should be easily provable, given appropriate axioms, and compare Section 3.4 on fundamental limitations of the Gödel machine.

Next: Bias-Optimal Proof Search (BIOPS) Up: How Online Proof Techniques Previous: Instructions/Subroutines for Making &
Juergen Schmidhuber 2003-12-11