Let us assume that the observed data sequence is generated by a computational process, and that any possible sequence of observations is therefore computable in the limit [22].
This assumption is stronger and more radical than the traditional one: Solomonoff just insists that the probability of any sequence prefix is recursively computable, but the (infinite) sequence itself may still be generated probabilistically.
Under our starting assumption that data are deterministically generated by a machine, it seems plausible that the machine suffers from a computational resource problem. Since some things are much harder to compute than others, the resource-oriented point of view suggests the following postulate.
To add some flesh to this postulate, we introduce the asymptotically fastest way of computing all computable data, for our particular universal reference TM:
FAST Algorithm (version 1): For perform PHASE :
PHASE : Execute instructions of all program prefixes satisfying , and sequentially write the outputs on adjacent sections of the output tape, separated by blanks.
Following Levin [14][15, p. 502-505],
within TM steps
FAST will generate all prefixes
satisfying
, where 's Levin complexity is
defined as
As noted by Hutter [11], there is an essentially equivalent version of FAST which makes very clear just how simple the asymptotically optimal method really is:
FAST (version 2): Execute one instruction of the -th program every steps on average, using blanks to separate program outputs.To see how we can obtain universal search [14][15, p. 502-505] from FAST, suppose we are seeking the solution to some inversion problem . For instance, might be a path through a maze providing reward , while for uneffective paths . Then we can use FAST to work through all programs and check for each whether it generates a path yielding reward. This will identify each reward-generating solution as quickly as the fastest algorithm that generates and tests that particular solution, save for a constant factor (which may be huge). Compare this to Hutter's recent more complex search algorithm for all well-defined problems [11] which reduces the unknown multiplicative constant factor to a remarkably small known value, namely, 5 -- at the expense of introducing an unknown, problem class-specific, additive constant.
How does the optimal algorithm tie in with Postulate 1? Since the most efficient way of computing all is embodied by FAST, which computes each as quickly as 's fastest algorithm, save for a constant factor, and since each PHASE of FAST (version 1) consumes roughly twice the time and space resources of the previous PHASE, the cumulative prior probability of things first computed in any particular PHASE should be roughly half the one of the previous PHASE; zero probability should be assigned to infinitely resource-consuming PHASEs. Postulate 1 therefore suggests Definition 2 below.
We observe that is a semimeasure -- compare [15]:
The very fact that a speed prior can be defined in a meaningful way is interesting in itself, and may also be relevant in some practical situations -- see Section 5.