
Theorems for EOMs and GTMs

Theorem 5.3   For $x \in B^{\sharp}$ with $P^E(x) > 0$,

\begin{displaymath}
KP^E(x) - 1 \leq K^E(x) \leq KP^E(x) + K^E(KP^E(x)) + O(1).
\end{displaymath} (38)

Using $K^E(y) \leq \log y + 2 \log \log y + O(1)$ for y interpreted as an integer (compare Def. 2.8), this yields

\begin{displaymath}
2^{-K^E(x)} < P^E(x) \leq O(2^{-K^E(x)}) \, (K^E(x))^2.
\end{displaymath} (39)

That is, objects that are hard to describe (in the sense that they have only long enumerating descriptions) have low probability.
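
To make the step from (38) to (39) explicit, here is a brief sketch (assuming, as in the preceding definitions, that $KP^E(x)$ equals $-\lg P^E(x)$ up to rounding). Substituting $y = KP^E(x)$ into the logarithmic bound gives

\begin{displaymath}
K^E(x) \leq KP^E(x) + \log KP^E(x) + 2 \log \log KP^E(x) + O(1).
\end{displaymath}

Exponentiating yields $2^{-KP^E(x)} \leq O(2^{-K^E(x)}) \, KP^E(x) \log^2 KP^E(x)$; since $P^E(x) = O(2^{-KP^E(x)})$ and, by the left-hand inequality of (38), $KP^E(x) \leq K^E(x) + 1$, the right-hand side of (39) follows from $y \log^2 y = O(y^2)$. The left-hand side of (39) holds because the minimal program for x alone contributes $2^{-K^E(x)}$ to the sum defining $P^E(x)$.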

Proof. The left-hand inequality follows by definition. To show the right-hand side, one can build an EOM T that computes $x \in B^{\sharp}$ using at most $KP^E(x) + K_T(KP^E(x)) + O(1)$ input bits, in a way inspired by Huffman coding [Huffman 1952]. The claim then follows from the Invariance Theorem. The trick is to arrange T's computation such that T's output converges yet never needs to decrease lexicographically. T works as follows:

(A) Emulate $U^E$ to construct a real enumerable number 0.s encoded as a self-delimiting input program r, simultaneously running all (possibly forever running) programs on $U^E$ in dovetail style. Whenever the output of a prefix q of any running program starts with some $x \in B^*$ for the first time, set $V(x) := V(x) + 2^{-l(q)}$ (if no program has ever created output starting with x, first create V(x) initialized by 0); whenever the output of some extension q' of q (obtained by possibly reading additional input bits; q' = q if none are read) increases lexicographically such that it no longer equals x, set $V(x) := V(x) - 2^{-l(q')}$.

(B) Simultaneously, starting at the right end of the unit interval [0,1), as the V(x) are being updated, keep updating a chain of disjoint, adjacent, half-open (at the right end) intervals $IV(x) = [LV(x), RV(x))$ of size $V(x) = RV(x) - LV(x)$, in alphabetic order on x, such that the right end of the IV(x) of the largest x coincides with the right end of [0,1), and IV(y) lies to the right of IV(x) if $y \succ x$. After every variable update and each change of s, replace the output of T by the x of the IV(x) with $0.s \in IV(x)$.

This will never violate the EOM constraints: the enumerable s cannot shrink, and since EOM outputs cannot decrease lexicographically, the interval boundaries RV(x) and LV(x) cannot grow (their negations are enumerable, compare Lemma 4.1), hence T's output cannot decrease.
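
The interval bookkeeping of step (B) may be illustrated by a minimal toy sketch in Python (ours, not part of the proof): floats stand in for the enumerable reals of the actual construction, and the candidate strings form a fixed finite set rather than the dovetailed enumeration.

    def build_chain(V):
        # V maps strings x to current weights V(x) > 0. Pack disjoint,
        # adjacent, half-open intervals IV(x) = [LV(x), RV(x)) of size
        # V(x) from right to left, the largest x flush against 1.0.
        chain, right = {}, 1.0
        for x in sorted(V, reverse=True):    # largest x first
            chain[x] = (right - V[x], right)
            right -= V[x]
        return chain

    def decode(chain, s):
        # T's output after an update: the x of the IV(x) with 0.s in IV(x).
        for x, (L, R) in chain.items():
            if L <= s < R:
                return x
        return None                          # 0.s not yet covered

    V = {'0': 0.25, '10': 0.25, '11': 0.5}   # toy weights V(x)
    iv = build_chain(V)                      # '11': [0.5, 1.0), '10': [0.25, 0.5), '0': [0.0, 0.25)
    print(decode(iv, 0.6))                   # -> '11'

Under the update rules of step (A), mass only arrives fresh at some x or moves from x to some $x' \succ x$ further right, so both boundaries of each IV(x) can only move left; with 0.s fixed, the decoded output therefore never decreases lexicographically.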

For $x \in B^*$ the IV(x) converge towards an interval I(x) of size $P^E(x)$. For $x \in B^{\infty}$ with $P^E(x) > 0$, we have: for any $\epsilon > 0$ there is a time $t_0$ such that for all time steps $t > t_0$ in T's computation, an interval $I_{\epsilon}(x)$ of size $P^E(x) - \epsilon$ will be completely covered by certain IV(y) satisfying $x \succ y$ and $0.x - 0.y < \epsilon$. So for $\epsilon \rightarrow 0$ the $I_{\epsilon}(x)$ also converge towards an interval I(x) of size $P^E(x)$. Hence T will output larger and larger y approximating x from below, provided $0.s \in I(x)$.

Since any interval of size c within [0,1) contains a number 0.z with $l(z) = \lceil -\lg c \rceil$, in both cases there is a number 0.s (encodable by some r satisfying $l(r) \leq l(s) + K_T(l(s)) + O(1)$) with $l(s) = -\lg P^E(x) + O(1)$ such that $T(r) \leadsto x$, and therefore $K_T(x) \leq l(s) + K_T(l(s)) + O(1)$. $\Box$
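
As an aside, the existential claim used in the last step is easy to verify numerically; a minimal sketch (the function name is ours):

    import math

    def dyadic_in_interval(L, c):
        # Any half-open interval [L, L+c) within [0,1) contains a point
        # 0.z with l(z) = ceil(-lg c): the grid of spacing 2^-l(z) is at
        # least as fine as c, so some grid point falls in the interval.
        n = math.ceil(-math.log2(c))      # l(z)
        k = math.ceil(L * 2 ** n)         # smallest multiple of 2^-n >= L
        assert k / 2 ** n < L + c         # it lies within [L, L+c)
        return format(k, 'b').zfill(n)    # the bits z, with 0.z = k * 2^-n

    print(dyadic_in_interval(0.3, 0.125)) # -> '011', i.e. 0.z = 0.375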

Less symmetric statements can also be derived in very similar fashion:

Theorem 5.4   Let TM T induce approximable $CP_T(x)$ for all $x \in B^*$ (compare Defs. 4.10 and 4.12; an EOM would be a special case). Then for $x \in B^{\sharp}$ with $P_T(x) > 0$:

 \begin{displaymath}K^G(x) \leq KP_T(x) + K^G(KP_T(x)) + O(1).
\end{displaymath} (40)

Proof. Modify the proof of Theorem 5.3 for approximable as opposed to enumerable interval boundaries and approximable 0.s. $\Box$

A similar proof, but without the complication for the case $x \in B^{\infty}$, yields:

Theorem 5.5   Let $\mu$ denote an approximable semimeasure on $x \in B^*$; that is, $\mu(x)$ is describable. Then for $\mu(x) > 0$:

\begin{displaymath}Km^G(x) \leq K\mu(x) + Km^G(K\mu(x)) + O(1);
\end{displaymath} (41)


\begin{displaymath}K^G(x) \leq K\bar{\mu}(x) + K^G(K\bar{\mu}(x)) + O(1).
\end{displaymath} (42)

As a consequence (by the same substitution that led to (39)),

\begin{displaymath}
\frac{\mu(x)}{K\mu(x) \log^2 K\mu(x)} \leq O(2^{-Km^G(x)}); \qquad
\frac{\bar{\mu}(x)}{K\bar{\mu}(x) \log^2 K\bar{\mu}(x)} \leq O(2^{-K^G(x)}).
\end{displaymath} (43)

Proof. Initialize variables $V_{\lambda} := 1$ and $IV_{\lambda} := [0,1)$. Dovetailing over all $x \succ \lambda$, approximate the GTM-computable $\bar{\mu}(x) = \mu(x) - \mu(x0) - \mu(x1)$ in variables $V_x$ initialized by zero, and create a chain of adjacent intervals $IV_x$ analogously to the proof of Theorem 5.3.

The $IV_x$ converge to intervals $I_x$ of size $\bar{\mu}(x)$. Hence x is GTM-encodable by any program r producing an output s with $0.s \in I_x$: after every update, replace the GTM's output by the x of the $IV_x$ with $0.s \in IV_x$. Similarly, if 0.s lies in the union of adjacent intervals $I_y$ of strings y starting with x, then the GTM's output will converge towards some string starting with x. The rest follows as in the final paragraph of the proof of Theorem 5.3. $\Box$
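
As with Theorem 5.3, the layout can be illustrated by a toy sketch (ours, not the paper's; Python floats stand in for limit-computable reals, and $\mu(x) = 3^{-l(x)}$ is an arbitrary example semimeasure chosen so that $\bar{\mu}(x)$ is strictly positive):

    def mu(x):                           # example semimeasure: mu(x) = 3^-l(x)
        return 3.0 ** -len(x)

    def mu_bar(x):                       # bar-mu(x) = mu(x) - mu(x0) - mu(x1)
        return mu(x) - mu(x + '0') - mu(x + '1')

    def chain(depth):
        # Adjacent intervals I_x of size bar-mu(x), laid out (left-aligned
        # here, unlike the right-aligned chain of Theorem 5.3) in the order
        # that puts every x before its own extensions; the block of
        # intervals of all strings starting with x then has total size
        # approaching mu(x) as depth grows.
        out, left = {}, 0.0
        def visit(x):
            nonlocal left
            out[x] = (left, left + mu_bar(x))
            left += mu_bar(x)
            if len(x) < depth:
                visit(x + '0')
                visit(x + '1')
        visit('')                        # '' stands for lambda
        return out

    def decode(intervals, s):
        # The GTM's output after an update: the x whose I_x contains 0.s.
        return next((x for x, (L, R) in intervals.items() if L <= s < R), None)

    iv = chain(depth=8)
    print(decode(iv, 0.4))               # -> '0', since 0.4 lies in I_x for x = '0'

The contiguous prefix blocks are what make the second part of the proof work: once 0.s falls inside the union of intervals of strings starting with x, the decoded output keeps starting with x.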

Using the basic ideas in the proofs of Theorems 5.3 and 5.5 in conjunction with Corollary 4.3 and Lemma 4.2, one can also obtain statements such as:

Theorem 5.6   Let $\mu_0$ denote the universal CEM from Theorem 4.1. For $x \in B^*$,

\begin{displaymath}
K\mu_0(x) - O(1) \leq Km^E(x) \leq K\mu_0(x) + Km^E(K\mu_0(x)) + O(1).
\end{displaymath} (44)

While $P^E$ dominates $P^M$ and $P^G$ dominates $P^E$, the reverse statements are not true. In fact, given the results from Sections 3.2 and 5, one can now make claims such as the following:

Corollary 5.1   The following functions are unbounded:

\begin{displaymath}\frac{\mu^E(x)}{\mu^M(x)}; ~~
\frac{P^E(x)}{P^M(x)}; ~~
\frac{P^G(x)}{P^E(x)}.
\end{displaymath}

Proof. For the cases $\mu^E$ and $P^E$, apply Theorems 5.2 and 5.6 and the unboundedness of (12). For the case $P^G$, apply Theorems 3.3 and 5.3. $\Box$

