Plausibility of Speed Prior S next up previous contents
Next: S-Based Predictions Up: Consequences for Physics Previous: Plausibility of Approximable Priors

Plausibility of Speed Prior S

Starting with the traditional case of recursive priors, the subsections above discussed more and more dominant priors as candidates for the one from which our universe is sampled. Now we will move towards the other extreme: the less dominant prior S which in a sense is optimal with respect to temporal complexity.

So far, without much ado, we have used a terminology according to which we ``draw a universe from a particular prior distribution.'' In the TM-based set-ups (see Def. 4.12) this in principle requires a ``binary oracle,'' a source of true randomness, to provide the TM's inputs. Any source of randomness, however, leaves us with an unsatisfactory explanation of the universe, since random strings do not have a compact explanation, by definition. The obvious way around this, already implicit in the definition of $\mu_T(x)$ (see Def. 4.17), is the ``ensemble approach'' which runs all possible TM input programs and sums over the lengths of those that compute strings starting with x.

Once we deal with ensemble approaches and explicit computations in general, however, we are forced to accept their fundamental time constraints. As mentioned above, many of the shortest programs of certain enumerable or describable strings compute their outputs more slowly than any recursive upper bound could indicate.

If we do assume that time complexity of the computation should be an issue, then why stop with the somewhat arbitrary restriction of recursiveness, which just says that the time required to compute something should be computable by a halting program? Similarly, why stop with the somewhat arbitrary restriction of polynomial time bounds which are subject of much of the work in theoretical computer science?

If I were a ``Great Programmer'' [#!Schmidhuber:97brauer!#] with substantial computing resources, perhaps beyond those possible in our own universe which apparently does not permit more than 1051 operations per second and kilogram [#!Bremermann:82!#,#!Lloyd:00!#], yet constrained by the fundamental limits of computability, I would opt for the fastest way of simulating all universes, represented by algorithm FAST (Section 6). Similarly, if I were to search for some computable object with certain properties discoverable by Levin's universal search algorithm (the ``mother'' of FAST), I would use the latter for its optimality properties.

Consider the observers evolving in the many different possible universes computed by FAST or as a by-product of Levin Search. Some of them would be identical, at least for some time, collecting identical experiences in universes with possibly equal beginnings yet possibly different futures. At a given time, the most likely instances of a particular observer A would essentially be determined by the fastest way of computing A.

Observer A might adopt the belief the Great Programmer was indeed smart enough to implement the most efficient way of computing everything. And given A's very existence, A can conclude that the Great Programmer's resources are sufficient to compute at least one instance of A. What A does not know, however, is the current phase of FAST, or whether the Great Programmer is interested in or aware of A, or whether A is just an accidental by-product of some Great Programmer's search for something else, etc.

Here is where a resource-oriented bias comes in naturally. It seems to make sense for A to assume that the Great Programmer is also bound by the limits of computability, that infinitely late phases of FAST consuming uncountable resources are infinitely unlikely, that any Great Programmer's a priori probability of investing computational resources into some search problem tends to decrease with growing search costs, and that the prior probability of anything whose computation requires more than O(n) resources by the optimal method is indeed inversely proportional to n. This immediately leads to the speed prior S.

Believing in S, A could use Theorem 6.1 to predict the future (or ``postdict'' unknown aspects of the past) by assigning highest probability to those S-describable futures (or pasts) that are (a) consistent with A's experiences and (b) are computable by short and fast algorithms. The appropriate simplicity measure minimized by this resource-oriented version of Occam's razor is the Levin complexity Kt.

next up previous contents
Next: S-Based Predictions Up: Consequences for Physics Previous: Plausibility of Approximable Priors
Juergen Schmidhuber

Related links: In the beginning was the code! - Zuse's thesis - Life, the universe, and everything - Generalized Algorithmic Information - Speed Prior - The New AI