** Next:** Bias-Optimal Proof Search (BIOPS)
** Up:** How Online Proof Techniques
** Previous:** Instructions/Subroutines for Making &

###

**Global Optimality Theorem:**
Self-Improvement Strategy is not Greedy

Intuitively, at any given time the Gödel machine should
execute some self-modification algorithm only if it is
the `best' of all possible self-modifications,
given the optimality criterion, which typically
depends on available resources, such as storage size and
remaining lifetime.
At first glance, however, target theorem (2)
seems to implicitly talk about just
one single modification algorithm, namely, *switchprog*
as set by the systematic proof searcher at time .
Isn't this type of local search greedy? Couldn't
it lead to a local optimum instead of a global one?
No, it cannot, according to the global optimality theorem:

**Theorem 2.1** (Global Optimality of Self-Changes)
Given any formalizable utility function

(Item

1f),
and assuming consistency of the underlying formal system,
any Gödel machine self-change obtained through execution of
some program

*switchprog* identified
through the proof of a target theorem (

2)
is globally optimal in the following sense:
the utility of starting the execution of the present

*switchprog* is higher than the utility of
waiting for the proof searcher
to produce an alternative

*switchprog* later.

**Proof.** The target theorem (2)
implicitly talks about all the other
*switchprog*s that the proof searcher
could produce in the future. To see this, consider
the two alternatives of the binary decision:
(1) either execute the current *switchprog*, or
(2) keep searching for *proof*s and *switchprog*s
until the systematic
searcher comes up with an even better *switchprog*.
Obviously the second alternative concerns all (possibly infinitely
many) potential *switchprog*s to be considered later. That is,
if the current *switchprog* were not the `best', then
the proof searcher would not be able to prove that
setting *switchbit* and
executing *switchprog* will cause higher expected reward
than discarding *switchprog*, assuming axiomatic consistency.

That is, only if it is provable that the
current proof searcher cannot be expected to find
sufficiently quickly an even better *switchprog*,
will it also be provable that the *current* *switchprog*
should be executed.
So the trick is to let the formalized improvement
criterion take into account the (expected) performance of
the original search method, which automatically enforces
globally optimal self-changes, relative
to the provability restrictions.

Will the Gödel machine ever prove a target theorem?
This obviously depends on the nature of
environment and utility function. See Section 3.1
for intuitive examples of very simple, limited self-rewrites
(affecting only near-future rewards) whose
benefits should be easily provable, given appropriate axioms,
and compare Section 3.4 on fundamental limitations of the Gödel machine.

** Next:** Bias-Optimal Proof Search (BIOPS)
** Up:** How Online Proof Techniques
** Previous:** Instructions/Subroutines for Making &
Juergen Schmidhuber
2003-12-11

Back to Goedel machine home page