Set-up and Formal Goal

Next: Basic Idea of Gödel Up: Overview / Basic Ideas Previous: Overview / Basic Ideas

Set-up and Formal Goal

Our hardware (e.g., a universal or space-bounded Turing machine [56] or the abstract model of a personal computer) has a single life which consists of discrete cycles or time steps $t=1, 2, \ldots$ . Its total lifetime

may or may not be known in advance. In what follows, the value of any time-varying variable

at time

will be denoted by

During each cycle our hardware executes an elementary operation which affects its variable state $s \in \cal S \subset B^*$ (without loss of generality, is the set of possible bitstrings over the binary alphabet $B=\{ 0, 1 \}$ ) and possibly also the variable environmental state $Env \in \cal E$ (here we need not yet specify the problem-dependent set $\cal E$ ). There is a hardwired state transition function $F: \cal S \times \cal E \rightarrow \cal S$ . For , is the state at a point where the hardware operation of cycle is finished, but the one of has not started yet. may depend on past output actions encoded in and is simultaneously updated or (probabilistically) computed by the possibly reactive environment.

In order to talk conveniently about programs and data, we will often attach names to certain string variables encoded as components or substrings of . Of particular interest are the three variables called time, x, y, and p:

At time , variable holds a unique binary representation of . We initialize $time(1)=\lq 1\textrm{'}$ , the bitstring consisting only of a one. The hardware increments from one cycle to the next. This requires at most and on average only computational steps.
Variable x holds the inputs form the environment to the Gödel machine. For , may differ from only if a program running on the Gödel machine has executed a special input-requesting instruction at time . Generally speaking, the delays between successive inputs should be sufficiently large so that programs can perform certain elementary computations on an input, such as copying it into internal storage (a reserved part of ) before the next input arrives.
Variable y holds the outputs of the Gödel machine. is an output bitstring which may subsequently influence the environment, where $y(1)=\lq 0\textrm{'}$ by default. For example, could be interpreted as a control signal for an environment-manipulating robot whose actions may have an effect on future inputs.
is the initial software: a program implementing the original (sub-optimal) policy for interacting with the environment, represented as a substring of , plus the original policy for searching proofs. Details will be discussed below.

At any given time ( $1 \leq t \leq T$ ) the goal is to maximize future success or utility. A typical ``value to go'' utility function is of the form $u(s, Env): \cal S \times \cal E \rightarrow R$ , where $\cal R$ is the set of real numbers:

$\begin{displaymath} u(s, Env) = E_{\mu} \left [ \sum_{\tau=time}^{E_{\mu}(T \mid s, Env)} r(\tau)~~ \Bigg\vert ~~s, Env \right ], \end{displaymath}$

(1)

where

is a real-valued reward input (encoded within

) at time

, $E_{\mu}(\cdot \mid \cdot)$ denotes the conditional expectation operator with respect to some possibly unknown distribution $\mu$ from a set

of possible distributions (

reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned

is a function of state

which uniquely identifies the current cycle. Note that we take into account the possibility of extending the expected lifespan $E_{\mu}(T \mid s, Env)$ through appropriate actions.

Alternative formalizable utility functions could favor improvement of worst case instead of expected future performance, or higher reward intake per time interval etc. Clearly, most classic problems of computer science can be formulated in this framework--see examples in Section 6.2.

Next: Basic Idea of Gödel Up: Overview / Basic Ideas Previous: Overview / Basic Ideas

Juergen Schmidhuber 2005-01-03