next up previous
Next: ADAPTIVE LS (ALS) Up: Solving POMDPs with Levin Previous: INTRODUCTION


Basic concepts. LS requires a set of $r$ primitive, prewired instructions $p_1, \ldots, p_r$ that can be composed to form arbitrary sequential programs. Essentially, LS generates and tests solution candidates $s$ (program outputs represented as strings over a finite alphabet) in order of their Levin complexities $Kt(s) = \min_q\{-\log P_M(q) + \log t(q,s)\}$, where $q$ stands for a program that computes $s$ in $t(q,s)$ time steps, and $P_M(q)$ is the probability of guessing $q$ according to a fixed Solomonoff-Levin distribution [Li and Vitányi, 1993] on the set of possible programs (in section 3, however, we will make the distribution variable).
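As a toy illustration of the $Kt$ definition (not from the paper; the program probabilities and runtimes below are invented), the Levin complexity of a string under a finite set of candidate programs can be computed as:

```python
import math

def levin_complexity(candidates):
    """Kt(s) = min over programs q of (-log P(q) + log t(q, s)).

    `candidates` is a list of (probability, runtime) pairs, one per
    program q that computes the target string s.
    """
    return min(-math.log2(p) + math.log2(t) for p, t in candidates)

# Two hypothetical programs computing the same string s:
# one likely but slow (P=1/4, t=16), one unlikely but fast (P=1/16, t=2).
print(levin_complexity([(0.25, 16), (0.0625, 2)]))  # min(2+4, 4+1) = 5.0
```

Note how the fast, improbable program wins here: a shorter runtime can compensate for a lower guessing probability, and vice versa.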

Optimality. Amazingly, given primitives representing a universal programming language, for a broad class of problems, including all inversion problems and time-limited optimization problems, LS can be shown to be optimal with respect to total expected search time, leaving aside a constant factor independent of the problem size [Levin, 1973,Levin, 1984,Li and Vitányi, 1993]. Still, until recently LS has not received much attention except in purely theoretical studies -- see, e.g., [Watanabe, 1992].

Practical implementation. In our practical LS version, there is an upper bound $k$ on program length (due to obvious storage limitations). $a_i$ denotes the address of the $i$-th instruction. Each program is generated incrementally: first we select an instruction for $a_1$, then for $a_2$, and so on. $P_M$ is given by a matrix $M$, where $M_{ij}$ ($i \in \{1, \ldots, k\}$, $j \in \{1, \ldots, r\}$) denotes the probability of selecting $p_j$ as the instruction at address $a_i$, given that the first $i-1$ instructions have already been selected. The probability of a program is the product of the probabilities of its constituents.
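Under the product rule just stated, $P_M(q)$ follows directly from the matrix entries; a minimal sketch (the matrix shape and the instruction-index encoding of a program are illustrative assumptions):

```python
def program_probability(M, program):
    """P_M(q): product of the selection probabilities M[i][j] for the
    instruction j chosen at the (i+1)-th address of the program."""
    p = 1.0
    for i, j in enumerate(program):
        p *= M[i][j]
    return p

r, k = 4, 3                             # r primitives, max program length k
M = [[1.0 / r] * r for _ in range(k)]   # uniform matrix, M_ij = 1/r
print(program_probability(M, [0, 2, 1]))  # (1/4)^3 = 0.015625
```

With the uniform initialization used in this section, every length-$l$ program has probability $r^{-l}$, so shorter programs are always more probable.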

LS' inputs are $M$ and the representation of a problem denoted by $N$. LS' output is a program that computes a solution to the problem, if one is found. In this section, all $M_{ij} = \frac{1}{r}$ will remain fixed. LS is implemented as a sequence of longer and longer phases:

Levin search(problem $N$, probability matrix $M$)

(1) Set $T$, the number of the current phase, equal to 1. In what follows, let $\phi(T)$ denote the set of not yet executed programs $q$ satisfying $P_M(q)$ $\ge \frac{1}{T}$.

(2) Repeat

(2.1) While $\phi(T) \neq \{\}$ and no solution found do: Generate a program $q \in \phi(T)$, and run $q$ until it either halts or has used up $\frac{P_M(q)T}{c}$ steps. If $q$ computed a solution for $N$, return $q$ and exit.

(2.2) Set $T := 2T$

until solution found or $T$ $\ge$ $T_{MAX}$.
Return $\{\}$.

Here $c$ and $T_{MAX}$ are prespecified constants. The procedure above is essentially the same (has the same order of complexity) as the one described in the first paragraph of this section -- see, e.g., [Solomonoff, 1986,Li and Vitányi, 1993].
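The phase structure above can be sketched in a few lines (a simplified toy, not the paper's implementation: $\phi(T)$ is realized by exhaustive enumeration up to length $k$, and the interpreter `run` and the solution test are invented placeholders):

```python
import itertools

def levin_search(is_solution, run, M, c=0.5, t_max=2**10):
    """Toy phase loop: in phase T, each not-yet-executed program q
    with P_M(q) >= 1/T is run for at most P_M(q)*T/c steps."""
    k, r = len(M), len(M[0])
    tried = set()                        # programs already executed
    T = 1
    while T < t_max:                     # (2.2): doubling phases
        for length in range(1, k + 1):
            for q in itertools.product(range(r), repeat=length):
                p = 1.0                  # P_M(q): product of M entries
                for i, j in enumerate(q):
                    p *= M[i][j]
                if q in tried or p < 1.0 / T:
                    continue             # q not in phi(T)
                tried.add(q)
                out = run(q, int(p * T / c))   # (2.1): bounded execution
                if out is not None and is_solution(out):
                    return q
        T *= 2
    return None                          # no solution up to T_MAX

# Invented toy interpreter: a program "computes" the sum of its
# instructions, needing one step per instruction; None signals a timeout.
def run(q, steps):
    return sum(q) if steps >= len(q) else None

r, k = 4, 3
M = [[1.0 / r] * r for _ in range(k)]
print(levin_search(lambda out: out == 5, run, M))  # -> (2, 3)
```

The search visits short (high-probability) programs in early phases and only reaches longer ones as $T$ grows, mirroring the $Kt$ ordering described above.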

Juergen Schmidhuber 2003-02-25
