TM-Induced Distributions and Convergence Probability next up previous contents
Next: Universal TM-Induced Measures Up: Measures and Probability Distributions Previous: Approximable and Cumulatively Enumerable

TM-Induced Distributions and Convergence Probability

Suppose TM T's input bits are obtained by tossing an unbiased coin whenever a new one is requested. Levin's universal discrete enumerable semimeasure [#!Levin:73a!#,#!Chaitin:75!#,#!Gacs:74!#] or semidistribution m is limited to B* and halting programs:

Definition 4.11 (m)  

\begin{displaymath}m(x) = \sum_{p: T(p)=x} 2^{-l(p)};
\end{displaymath} (25)

Note that $\sum_x m(x) < 1$ if T universal. Let us now generalize this to $B^{\sharp}$ and nonhalting programs:

Definition 4.12 (tex2html_wrap_inline$P_T, KP_T$)   Suppose T's input bits are obtained by tossing an unbiased coin whenever a new one is requested.

\begin{displaymath}P_T(x) = \sum_{p: T(p) \leadsto x} 2^{-l(p)},~~
KP_T(x) = -lg P_T(x)~for~P_T(x)>0,
\end{displaymath} (26)

where $x, p \in B^{\sharp}$.

Program Continua. According to Def. 4.12, most infinite x have zero probability, but not those with finite programs, such as the dyadic expansion of $0.5 \sqrt{2}$. However, a nonvanishing part of the entire unit of probability mass is contributed by continua of mostly incompressible strings, such as those with cumulative probability 2-l(q) computed by the following class of uncountably many infinite programs with a common finite prefix q: ``repeat forever: read and print next input bit.'' The corresponding traditional measure-oriented notation for

\begin{displaymath}\sum_{x: T(qx) \leadsto x} 2^{-l(qx)} = 2^{-l(q)}
\end{displaymath}

would be

\begin{displaymath}\int_{0.q}^{0.q + 2^{-l(q)}} dx = 2^{-l(q)}.
\end{displaymath}

For notational simplicity, however, we will continue using the $\sum$ sign to indicate summation over uncountable objects, rather than using a measure-oriented notation for probability densities. The reader should not feel uncomfortable with this -- the theorems in the remainder of the paper will focus on those $x \in B^{\sharp}$ with P(x)>0; density-like nonzero sums over uncountably many bitsrings, each with individual measure zero, will not play any critical role in the proofs.

Definition 4.13 (Universal TM-Induced Distributions tex2html_wrap_inline$P^C;KP^C$)   If C denotes a set of TMs with universal element UC, then we write

PC(x) = PUC(x);    KPC(x) := -lg PC(x)  for  PC(x)>0. (27)

We have PC(x) > 0 for $D_C \subset B^{\sharp}$, the subset of C-describable $x \in B^{\sharp}$. The attribute universal is justified, because of the dominance PT(x) = O(PC(x)), due to the Invariance Theorem (compare Def. 3.3).

Since all programs of EOMs and MTMs converge, PE and PM are proper probability distributions on $B^{\sharp}$. For instance, $\sum_x P^E(x) = 1$. PG, however, is just a semidistribution. To obtain a proper probability distribution PNT, one might think of normalizing by the convergence probability $\Upsilon$:

Definition 4.14 (Convergence Probability)   Given GTM T, define

\begin{displaymath}PN_T(x) = \frac{\sum_{T(p) \leadsto x} 2^{-l(p)}}
{\Upsilon^T},
\end{displaymath}

where

\begin{displaymath}\Upsilon^T = \sum_{p: \exists x: T(p) \leadsto x} 2^{-l(p)}.
\end{displaymath}

Describability issues. Uniquely encode each TM T as a finite bitstring, and identify M, E, G with the corresponding sets of bitstrings. While the function $f^M: M \rightarrow B^{\sharp}: f(T) = \Omega^T$ is describable, even enumerable, the function $f^G: G \rightarrow B^{\sharp}: f(T) = \Upsilon^T$ is not, essentially due to Theorem 2.1.

Even PE(x) and PM(x) are generally not describable for $x \in B^{\sharp}$, in the sense that there is no GTM T that takes as an input a finite description (or program) of any M-describable or E-describable $x \in B^{\sharp}$ and converges towards PM(x) or PE(x). This is because in general it is not even weakly decidable (Def. 2.11) whether two programs compute the same output. If we know that one of the program outputs is finite, however, then the conditions of weak decidability are fulfilled. Hence certain TM-induced distributions on B* are describable, as will be seen next.

Definition 4.15 (TM-Induced Cumulative Distributions)   If C denotes a set of TMs with universal element UC, then we write (compare Def. 4.10):

CPC(x) = CPUC(x). (28)

Lemma 4.1   For $x \in B^*$, CPE(x) is enumerable.

Proof. The following algorithm computes CPE(x) (compare proof of Theorem 3.3):

Initialize the real-valued variable V by 0, run all possible programs of EOM T dovetail style; whenever the output of a program prefix q starts with some $y \succeq x$ for the first time, set V:= V+ 2-l(q); henceforth ignore continuations of q.
In this way V enumerates CPE(x). Infinite p are not problematic as only a finite prefix of p must be read to establish $y \succeq x$ if the latter indeed holds. $\Box$

Similarly, facts of the form $y \succ x \in B^*$ can be discovered after finite time.

Corollary 4.1   For $x \in B^*$, PE(x) is approximable or describable as the difference of two enumerable values:

 \begin{displaymath}P^E(x) = \sum_{y \succeq x} P^E(y) - \sum_{y \succ x} P^E(y),
\end{displaymath} (29)

Now we will make the connection to the previous subsection on semimeasures on B*.


next up previous contents
Next: Universal TM-Induced Measures Up: Measures and Probability Distributions Previous: Approximable and Cumulatively Enumerable
Juergen Schmidhuber
2001-01-09


Related links: In the beginning was the code! - Zuse's thesis - Life, the universe, and everything - Generalized Algorithmic Information - Speed Prior - The New AI