OOPS For Finding Universal Solvers

Now suppose there is an ordered sequence of tasks $r_1, r_2, \ldots$ . Task

may or may not depend on solutions for

$(i,j=1,2,\ldots, j>i).$ For instance, task

may be to find a faster way through a maze than the one found during the search for a solution to task $r_{j-1}$ .

We are searching for a single program solving all tasks encountered so far (see [9] for variants of this setup). Inductively suppose we have solved the first

tasks through programs stored below address $a_{frozen}$ , and that the most recently found program starting at address $a_{last} \leq a_{frozen}$ actually solves all of them, possibly using information conveyed by earlier programs. To find a program solving the first

tasks, OOPS invokes Try as follows (using set notation for ring

Method 2.2 (OOPS (n+1)) Initialize

$qp := a_{frozen}$ .

1. Set $R = \{ r_{n+1} \}$ and $ip(r_{n+1}) := a_{last}$ . IF Try ( $qp, r_{n+1}, R, 0, 0.5$ ) then exit.

2. IF go to 3. Set $R = \{ r_1, r_2, \ldots, r_{n+1} \}$ ; set local variable $a := a_{frozen} + 1$ ; for all $r \in R$ set . IF Try ( $qp, r_{n+1}, R, 0, 0.5$ ) set $a_{last} := a$ and exit.

3. Set , and go to 1.

Therefore, given tasks $r_1, r_2, \ldots,$ we first initialize $a_{last}$ ; then for $i := 1, 2, \ldots$ invoke OOPS

to find programs starting at (possibly increasing) address $a_{last}$ , each solving all tasks so far, possibly eventually discovering a universal solver for all tasks in the sequence. As address $a_{last}$ increases for the

-th time,

is defined as the program starting at $a_{last}$ 's old value and ending right before its new value. Clearly,

(

) may exploit

Optimality. OOPS not only is asymptotically optimal in Levin's sense [6] (see Method 1.1), but also near bias-optimal (Def. 2.1). To see this, consider a program

solving problem

within

steps, given current code bias $q_{0:a_{frozen}}$ and $a_{last}$ . Denote

's probability by

. A bias-optimal solver would solve

within at most

steps. We observe that OOPS will solve

within at most

steps, ignoring overhead: a factor 2 might get lost for allocating half the search time to prolongations of the most recent code, another factor 2 for the incremental doubling of

(necessary because we do not know in advance the best value of

), and another factor 2 for Try's resets of states and tasks. So the method is 8-bias-optimal (ignoring hardware-specific overhead) with respect to the current task.

Our only bias shifts are due to freezing programs once they have solved a problem. That is, unlike the learning rate-based bias shifts of ADAPTIVE LSEARCH [10], those of OOPS do not reduce probabilities of programs that were meaningful and executable before the addition of any new

. Only formerly meaningless, interrupted programs trying to access code for earlier solutions when there weren't any suddenly may become prolongable and successful, once some solutions to earlier tasks have been stored.

Hopefully we have

, where

is among the most probable fast solvers of

that do not use previously found code. For instance,

may be rather short and likely because it uses information conveyed by earlier found programs stored below $a_{frozen}$ . E.g.,

may call an earlier stored

as a subprogram. Or maybe

is a short and fast program that copies

into state

, then modifies the copy just a little bit to obtain $\bar{q}^i$ , then successfully applies $\bar{q}^i$ to

. If

is not many times faster than

, then OOPS will in general suffer from a much smaller constant slowdown factor than LSEARCH, reflecting the extent to which solutions to successive tasks do share useful mutual information.

Unlike nonincremental LSEARCH and HSEARCH, which do not require online-generated programs for their aymptotic optimality properties, OOPS does depend on such programs: The currently tested prefix may temporarily rewrite the search procedure by invoking previously frozen code that redefines the probability distribution on its suffixes, based on experience ignored by LSEARCH & HSEARCH (metasearching & metalearning!).

As we are solving more and more tasks, thus collecting and freezing more and more

, it will generally become harder and harder to identify and address and copy-edit particular useful code segments within the earlier solutions. As a consequence we expect that much of the knowledge embodied by certain

actually will be about how to access and edit and use programs

(

) previously stored below