Speed Prior-Based Inductive Inference

Next: Practical Applications of Algorithm Up: Temporal Complexity Previous: Speed Prior S and

Speed Prior-Based Inductive Inference

Given S, as we observe an initial segment $x \in B^*$ of some string, which is the most likely continuation? Consider x's finite continuations $xy, y \in B^*$ . According to Bayes (compare Equation (15)),

$\begin{displaymath}S(xy \mid x) = \frac{S(x \mid xy) S(xy)} {S(x)} = \frac{S(xy)} {S(x)}, \end{displaymath}$

(48)

where $S(z^2 \mid z^1)$ is the measure of z², given z¹. Having observed x we will predict those y that maximize $S(xy \mid x)$ . Which are those? In what follows, we will confirm the intuition that for $n \rightarrow \infty$ the only probable continuations of x_n are those with fast programs. The sheer number of ``slowly'' computable strings cannot balance the speed advantage of ``more quickly'' computable strings with equal beginnings.

Definition 6.4 (tex2html_wrap_inline$p < k_i x$ etc.) Write $p \stackrel{< k}{\longrightarrow} x$ if finite program p ( $p \to x$ ) computes x within less than k steps, and $p \stackrel{< k}{\longrightarrow_i} x$ if it does so within PHASE i of FAST. Similarly for $p \stackrel{\leq k}{\longrightarrow} x$ and $p \stackrel{\leq k}{\longrightarrow_i} x$ (at most k steps), $p \stackrel{= k}{\longrightarrow} x$ , (exactly k steps), $p \stackrel{\geq k}{\longrightarrow} x$ , (at least k steps), $p \stackrel{> k}{\longrightarrow} x$ (more than k steps).

Theorem 6.1 Suppose $x \in B^{\infty}$ is S-describable, and $p^x \in B^*$ outputs x_n within at most f(n) steps for all n, and g(n) > O(f(n)). Then

$\begin{displaymath}Q(x,g,f):= lim_{n \to \infty} \frac{ \sum_{i=1}^{\infty} 2^{... ...p \stackrel{\leq f(n)}{\longrightarrow_i} x_n} 2^{-l(p)}} = 0. \end{displaymath}$

Proof. Since no program that requires at least g(n) steps for producing x_n can compute x_n in a phase with number < log g(n), we have

$\begin{displaymath}Q(x,g,f) \leq lim_{n \to \infty} \frac {\sum_{i=1}^{\infty} ... ...m_{p \stackrel{= f(n)}{\longrightarrow_i} x_n}2^{-l(p)}} \leq \end{displaymath}$

$\begin{displaymath}lim_{n \to \infty} \frac{ f(n) \sum_{p \to x_n} 2^{-l(p)} } ... ...ty} \frac{ f(n) } { g(n) } \frac{ 1 } { 2^{-l(p^x)} } = 0. \end{displaymath}$

Here we have used the Kraft inequality [#!Kraft:49!#] to obtain a rough upper bound for the enumerator: when no p is prefix of another one, then $\sum_{p} 2^{-l(p)} \leq 1.$ $\Box$

Hence, if we know a rather fast finite program p^x for x, then Theorem 6.1 allows for predicting: if we observe some x_n (n sufficiently large) then it is very unlikely that it was produced by an x-computing algorithm much slower than p^x.

Among the fastest algorithms for x is FAST itself, which is at least as fast as p^x, save for a constant factor. It outputs x_n after O(2^Kt(x_n)) steps. Therefore Theorem 6.1 tells us:

Corollary 6.1 Let $x \in B^{\infty}$ be S-describable. For $n \to \infty$ , with probability 1 the continuation of x_n is computable within O(2^Kt(x_n)) steps.

Given observation x with $l(x) \to \infty$ , we predict a continuation y with minimal Kt(xy).

Example 6.1 Consider Example 1.2 and Equation (1). According to the weak anthropic principle, the conditional probability of a particular observer finding herself in one of the universes compatible with her existence equals 1. Given S, we predict a universe with minimal Kt. Short futures are more likely than long ones: the probability that the universe's history so far will extend beyond the one computable in the current phase of FAST (that is, it will be prolongated into the next phase) is at most 50 %. Infinite futures have measure zero.

Next: Practical Applications of Algorithm Up: Temporal Complexity Previous: Speed Prior S and

Juergen Schmidhuber
2001-01-09