Plausibility of the Speed Prior

Starting with the traditional case of recursive priors, the subsections
above discussed more and more dominant priors as candidates for the
one from which our universe is sampled. Now we will move towards the
other extreme: the less dominant prior *S* which in a sense
is optimal with respect to temporal complexity.

So far, without much ado, we have used a terminology according to
which we ``draw a universe from a particular prior distribution.''
In the TM-based set-ups (see Def. 4.12) this in principle
requires a ``binary oracle,'' a source of true randomness, to provide
the TM's inputs. Any source of randomness, however, leaves us with an
unsatisfactory explanation of the universe, since random strings do not
have a compact explanation, by definition. The obvious way around this,
already implicit in Def. 4.17, is
the ``ensemble approach'' which runs *all* possible TM input programs and
sums over the lengths of those that compute strings starting with *x*.
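For concreteness, such a length-weighted ensemble sum has the familiar form (a sketch only; the precise measure is the one of Def. 4.17, whose details are not restated here):

$$ P(x) \;=\; \sum_{p \,:\, \text{the TM's output on } p \text{ starts with } x} 2^{-l(p)}, $$

where $l(p)$ denotes the length of input program $p$, so that short programs dominate the sum and no external source of true randomness is needed.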

Once we deal with ensemble approaches and explicit computations in general, however, we are forced to accept their fundamental time constraints. As mentioned above, many of the shortest programs of certain enumerable or describable strings compute their outputs more slowly than any recursive upper bound could indicate.

If we do assume that the time complexity of the computation should be an issue, then why stop at the somewhat arbitrary restriction of recursiveness, which just says that the time required to compute something should be computable by a halting program? Similarly, why stop at the somewhat arbitrary restriction of polynomial time bounds, which are the subject of much of the work in theoretical computer science?

If I were a ``Great Programmer'' [#!Schmidhuber:97brauer!#] with
substantial computing resources, perhaps beyond those possible in our own
universe, which apparently does not permit more than 10^51 operations
per second and kilogram [#!Bremermann:82!#,#!Lloyd:00!#], yet constrained by the
fundamental limits of computability, I would opt for the fastest way of
simulating all universes, represented by algorithm **FAST**
(Section 6). Similarly,
if I were to search for some computable object with certain properties
discoverable by Levin's universal search algorithm (the ``mother'' of
**FAST**), I would use Levin search for its optimality properties.
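To make the resource trade-off explicit, here is a minimal sketch of the phase structure of **FAST**, assuming the schedule of Section 6 in which phase $i$ allocates $2^{i-l(p)}$ instructions to every program prefix $p$ with $l(p) \le i$; the interpreter `step` is a hypothetical placeholder, not part of the paper:

```python
# Sketch of the FAST phase schedule (assumption: phase i runs every program
# prefix p with l(p) <= i for 2**(i - l(p)) elementary instructions).
from itertools import product

def fast_phases(step, max_phase):
    """Enumerate (phase, program, output) triples for phases 1..max_phase.

    `step(p, budget)` is a hypothetical interpreter: it runs program prefix
    `p` (a bit string) for at most `budget` instructions and returns the
    output prefix produced so far.
    """
    for i in range(1, max_phase + 1):
        for length in range(1, i + 1):
            budget = 2 ** (i - length)          # shorter programs get more steps
            for bits in product("01", repeat=length):
                p = "".join(bits)
                yield i, p, step(p, budget)
```

All $2^{l}$ prefixes of a given length $l \le i$ together receive $2^{i}$ instructions per phase, so an output that a program of length $l$ needs $t$ steps to produce first appears around phase $i \approx l + \log_2 t$.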

Consider the observers evolving in the many different possible universes
computed by **FAST** or as a by-product of Levin Search. Some of
them would be identical, at least for some time,
collecting identical experiences in universes with possibly equal
beginnings yet possibly different futures. At a given time, the most
likely instances of a particular observer *A* would essentially be
determined by the fastest way of computing *A*.

Observer *A* might adopt the belief that the Great Programmer was indeed smart
enough to implement the most efficient way of computing everything.
And given *A*'s very existence, *A* can conclude that the Great
Programmer's resources are sufficient to compute at least one instance
of *A*. What *A* does not know, however, is the current phase of
**FAST**, or whether the Great Programmer is interested in or aware of *A*,
or whether *A* is just an accidental by-product of some Great Programmer's
search for something else, etc.

Here is where a resource-oriented bias comes in naturally. It seems to
make sense for *A* to assume
that the Great Programmer is also bound by the limits of computability,
that infinitely late phases of **FAST**
consuming uncountable resources are infinitely unlikely,
that any Great Programmer's *a priori* probability of
investing computational resources into some search problem
tends to decrease with growing search costs,
and that the prior probability of anything whose computation
requires more than *O*(*n*) resources by the optimal method
is indeed inversely proportional to *n*. This immediately
leads to the speed prior *S*.
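One way to make this inverse proportionality concrete, roughly following the construction of *S* in Section 6 (the notation below is a sketch, not a restatement of the paper's definition), is

$$ S(x) \;=\; \sum_{i \ge 1} 2^{-i}\, S_i(x), $$

where $S_i(x)$ is the probability that a TM whose input bits are chosen by fair coin flips produces an output starting with *x* within phase $i$ of **FAST**. Since the cost of phase $i$ grows essentially like $2^{i}$, an *x* that first becomes available in phase $i$ is weighted by roughly $2^{-i}$, that is, inversely proportionally to the resources consumed before it appears.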

Believing in *S*, *A* could use Theorem 6.1 to predict
the future (or ``postdict'' unknown aspects of the past)
by assigning the highest probability to those *S*-describable
futures (or pasts) that are (a) consistent with *A*'s experiences and (b)
computable by short and fast algorithms. The appropriate simplicity
measure minimized by this resource-oriented version of Occam's razor is
the Levin complexity *Kt*.
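In standard notation (the usual definition from the literature, not a restatement of the paper's),

$$ Kt(x) \;=\; \min_{p} \bigl\{\, l(p) + \log_2 t(p) \;:\; p \text{ computes } x \text{ in } t(p) \text{ steps} \,\bigr\}, $$

so $2^{-Kt(x)} = \max_p 2^{-l(p)}/t(p)$: minimizing *Kt* over the futures consistent with *A*'s experiences is, up to additive constants, the same as maximizing this length-and-runtime trade-off, which is exactly the resource-oriented bias behind *S*.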
