COMMENTS

Next: APPLICATION: FINDING ``SIMPLE'' NEURAL Up: PROBABILISTIC SEARCH FOR USEFUL Previous: PROBABILISTIC SEARCH FOR USEFUL

COMMENTS

1. Universality. It is not difficult to show that the above primitives form a universal set in the following sense: they can be composed to form programs writing any computable integer sequence onto the work tape (within the given size and range limitations). Note that the primitives make it easy to create programs for handling stacks, recursion, etc. A program may create executable code (or sub-programs) on the work tape. Since executable code is represented as a sequence of numbers, the code may modify itself. The scheme allows for very general sequential interaction with the environment (given appropriate problem-specific actions translating storage contents into output actions and environmental changes).

2. Self-sizing programs. The whole set-up is an adaptation of the Turing machine from section 2. It is designed to be more efficient in the sense that programs may exploit what conventional digital machines are good at: fast storage addressing, jumping, etc.²Using a single array of cells with negative addresses for the work tape and positive addresses for the program tape considerably simplifies the primitives. At first glance, there seems to be a price to pay for the general addressing capabilities: random jumps are unlikely to contribute to successful programs if there are many possible legal addresses to jump to. However, since programs may influence their own size and thus the number of available legal addresses, they have the potential to remain small and ``likely,'' as will be seen next.

How do programs influence their own size? They can keep small by (1) avoiding requests for new oracles (e.g., by avoiding jumps to the current ), and (2) by using Allocate and Free in a balanced way. Oracle requests, Allocate and Free provide the only ways of influencing the number of ``visible'' legal addresses available in used storage. The oracle requests are the only source of randomness, however. If the current program does not request many oracles, its space probability will tend to remain low, although the program may perform extensive computations. The bigger the used storage, however, the smaller the probability of guessing a particular ``visible'' address, and the less likely the arguments of instructions (like ``Jumleq'') generated by future oracle requests. Small is beautiful.

3. Reruns. Since the program tape is ``read/execute'' only, we can rerun a randomly chosen halting program written onto the program tape by erasing the work tape, making the InstructionPointer equal to 0, and starting the execution loop. The contents of the program tape completely determine the contents of the work tape at any given time (unless the environment reacts in a non-deterministic way). The desire for being able to rerun programs is also the reason for the way oracle requests are handled. One might have introduced an extra primitive ``RequestOracle'' calling for a random instruction to appear in some location, say, at the end of the used program tape. But then different reruns would have led to different oracles, in general. The way it is done, oracle requests are generated by any operation moving the InstructionPointer to the top of the used program tape (the ), and any oracle is executed immediately. (Thus, any legal instruction appearing on the program tape out of the blue is executed at least once. One might say that ``randomness is not wasted'' but handled efficiently.) During reruns, the same moves of the InstructionPointer will not provoke oracle requests. This is because what used to be the at some point of guessing the program, now is just another address with fixed contents. During reruns, there is absolute determinism (unless the environment reacts in a non-deterministic way, of course).

4. Probabilistic setting. Why use a probabilistic search algorithm instead of the original universal search procedure? Why create time limits and programs randomly instead of systematically enumerating them? One reason is to avoid unintended bias. For instance, unintended bias may be introduced by imposing a systematic (say, alphabetic) order among programs with equal quotients of probability and runtime. A drawback of the probabilistic version above, however, is that programs with low Levin complexity (in general) will be tested more than once.

When speed is an issue, then we will prefer systematic enumeration, or a slightly more complicated probabilistic variant whose expected search time equals the one of systematic enumeration. Variants of systematic universal search were implemented in collaboration with Stefan Heil (1995) and Marco Wiering (1996). With the examples below, however, total search time is not the main issue: the simulations in the next section (based on probabilistic search) are intended to highlight generalization performance, not speed. Very similar results were obtained by systematic search, however.

Next: APPLICATION: FINDING ``SIMPLE'' NEURAL Up: PROBABILISTIC SEARCH FOR USEFUL Previous: PROBABILISTIC SEARCH FOR USEFUL

Juergen Schmidhuber 2003-02-12

Back to Optimal Universal Search page
Back to Program Evolution page
Back to Algorithmic Information page
Back to Speed Prior page