Computable Predictions through the Speed Prior Based on the Fastest Way of Describing Objects

Unfortunately, while $M$ and the more general priors of Section 4
are computable in the limit, they are not recursive, and thus
practically infeasible. This drawback inspired less general yet
practically more feasible principles of minimum description length
(MDL) [71,41] as well as priors derived from
time-bounded restrictions [31]
of Kolmogorov complexity
[28,62,9].
No particular instance of these approaches, however, is universally
accepted or has a general convincing motivation that carries beyond rather
specialized application scenarios. For instance, typical efficient MDL
approaches require the specification of a class of computable models of
the data, say, certain types of neural networks, plus some computable
loss function expressing the coding costs of the data relative to the
model. This provokes numerous *ad-hoc* choices.
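To make the kind of choice involved concrete, a hedged sketch: the generic two-part MDL criterion (standard textbook form, not tied to any particular reference cited here) selects

$$H^* \;=\; \arg\min_{H \in \mathcal{H}} \Big[ L(H) + L(D \mid H) \Big],$$

where $\mathcal{H}$ is the chosen class of computable models, $L(H)$ the coding cost of a model, and $L(D \mid H)$ the coding cost of the data given the model. Each of $\mathcal{H}$, $L(H)$, and $L(D \mid H)$ is such an *ad-hoc* choice.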

Our recent work [54], however,
offers an alternative to the celebrated
but noncomputable algorithmic simplicity measure
or Solomonoff-Levin measure discussed above
[62,77,63].
We introduced a new measure (a prior on the computable objects)
which is not based
on the **shortest** but on the **fastest** way of describing objects.

Let us assume that the observed data sequence is generated by a computational process, and that any possible sequence of observations is therefore computable in the limit [50]. This assumption is stronger and more radical than the traditional one: Solomonoff just insists that the probability of any sequence prefix is recursively computable, but the (infinite) sequence itself may still be generated probabilistically.

Given our starting assumption that data are deterministically generated by a machine, it seems plausible that the machine suffers from a computational resource problem. Since some things are much harder to compute than others, the resource-oriented point of view suggests the following postulate.

**Postulate 1.** *The cumulative prior probability measure of all $x$ incomputable within time $t$ should be inversely proportional to $t$.*

This postulate leads to the Speed Prior $S(x)$, the probability that the output of the following probabilistic algorithm starts with $x$ [54]:

**Algorithm GUESS:**

**Initialize:** Set $t := 1$. Let the input scanning head of a universal TM point to the first cell of its initially empty input tape.

**Forever repeat:** While the number of instructions executed so far exceeds $t$: toss an unbiased coin; if heads is up set $t := 2t$; otherwise exit. If the input scanning head points to a cell that already contains a bit, execute the corresponding instruction (of the growing self-delimiting program, e.g., [30,31]). Else toss the coin again, set the cell's bit to 1 if heads is up (0 otherwise), and set $t := t/2$.
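The following is a minimal Python sketch of a GUESS-style sampler. It is illustrative only: `ToyMachine`, `guess`, `step`, and `max_output` are hypothetical names introduced here, a trivial two-instruction machine stands in for the universal TM, the scanning head simply advances one cell per executed instruction, and the "forever" loop is truncated once enough output has been produced.

```python
import random

class ToyMachine:
    """Stand-in for the universal TM (illustrative only):
    instruction 1 flips an internal state bit, instruction 0 emits it."""

    def __init__(self):
        self.state = 0
        self.output = []

    def step(self, instruction):
        if instruction == 1:
            self.state ^= 1                  # flip internal bit
        else:
            self.output.append(self.state)   # emit internal bit


def guess(machine, max_output=20):
    """One run of a GUESS-style sampler: the step budget t is doubled with
    probability 1/2 whenever exhausted (otherwise we exit), and each input
    bit is guessed by a fair coin toss on demand, halving t."""
    t = 1.0        # current step budget
    tape = []      # input bits guessed so far (the growing program)
    pos = 0        # input scanning head
    steps = 0      # instructions executed so far

    while len(machine.output) < max_output:
        while steps > t:                     # budget exhausted:
            if random.random() < 0.5:
                t *= 2.0                     # heads: double the budget
            else:
                return machine.output        # tails: exit
        if pos < len(tape):
            machine.step(tape[pos])          # cell has a bit: execute it
            pos += 1
            steps += 1
        else:
            # guess a fresh input bit; pay for it by halving t
            tape.append(1 if random.random() < 0.5 else 0)
            t /= 2.0
    return machine.output


if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        print(guess(ToyMachine()))
```

Note how the two coin mechanisms encode the postulate: surviving $i$ budget doublings has probability $2^{-i}$, so runs requiring time beyond $t$ survive with probability roughly inversely proportional to $t$, while each guessed program bit costs a factor of $1/2$, recovering the usual $2^{-l(p)}$ weighting of a self-delimiting program $p$ of length $l(p)$.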

Algorithm **GUESS** is very similar to a probabilistic search
algorithm used in previous work on applied inductive inference
[47,49].
On several toy problems it generalized
extremely well in a way unmatchable by traditional neural network
learning algorithms.

With $S$ comes a computable method **AS** for predicting
optimally within $\epsilon$ accuracy [54].
Consider a finite but unknown program $p$ computing the data sequence $y$.
What if Postulate 1 holds but $p$ is not optimally
efficient, and/or computed on a computer that differs from
our reference machine? Then we effectively do not sample
beginnings of $y$ from $S$ but from an alternative semimeasure $S'$.
Can we still predict well? Yes, because the
Speed Prior $S$ dominates $S'$.
This dominance is all we need to apply the recent loss bounds
[21].
The loss that we are expected to receive
by predicting according to **AS**
instead of using the true but unknown $S'$ does not exceed
the optimal loss by much [54].
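A hedged sketch of why dominance suffices (the symbols $c$ and $L$ are introduced here for illustration; the precise statements and constants are in [21,54]): dominance of $S$ over $S'$ means

$$\exists\, c > 0:\quad S(x) \;\ge\; c\, S'(x) \quad \text{for all finite prefixes } x,$$

and Hutter-style bounds [21] then limit the excess expected loss of **AS** to something of the general form

$$0 \;\le\; L_{AS} - L_{\mathrm{opt}} \;\le\; O\!\left(\sqrt{L_{\mathrm{opt}}\,\ln c^{-1}} \;+\; \ln c^{-1}\right),$$

where $L_{\mathrm{opt}}$ denotes the expected loss of the optimal predictor that knows $S'$. The smaller the dominance constant $c$, the looser the guarantee, but the bound holds without ever identifying $S'$.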