Neither Levin's universal search  nor its incremental extension, the Optimal Ordered Problem Solver [38,40], nor Solomonoff's recent ideas  are `universal enough' for such general setups, and our earlier self-modifying online learning systems [29,32,46,45,47] are not necessarily optimal. Hutter's recent AIXI model  does execute optimal actions in very general environments evolving according to arbitrary, unknown, yet computable probabilistic laws, but only under the unrealistic assumption of unlimited computation time. AIXI's asymptotically optimal, space/time-bounded cousin AIXI(t,l)  may be the system conceptually closest to the one pesented here. In discrete cycle of AIXI(t,l)'s lifetime, action results in perception and reward , where all quantities may depend on the complete history. Using a universal computer such as a Turing machine , AIXI(t,l) needs an initial offline setup phase (prior to interaction with the environment) to examine all proofs of length at most , filtering out those that identify programs (of maximal size and maximal runtime per cycle) which not only could interact with the environment but which for all possible interaction histories also correctly predict a lower bound of their own expected reward. In cycle , AIXI(t,l) then runs all programs identified in the setup phase (at most ), finds the one with highest self-rating, and executes its corresponding action. The problem-independent setup time (where almost all of the work is done) is , and the online computation time per cycle is .
AIXI(t,l) is related to Hutter's `fastest' algorithm for all well-defined problems (HSEARCH ) which also uses a general proof searcher, and also is asymptotically optimal in a certain sense. Assume discrete input/output domains , a formal problem specification (say, a functional description of how integers are decomposed into their prime factors), and a particular (say, an integer to be factorized). HSEARCH orders all proofs of an appropriate axiomatic system by size to find programs that for all provably compute within time bound . Simultaneously it spends most of its time on executing the with the best currently proven time bound . It turns out that HSEARCH is as fast as the fastest algorithm that provably computes for all , save for a constant factor smaller than (arbitrary ) and an -specific but -independent additive constant . That is, HSEARCH and AIXI(t,l) boast an optimal order of complexity. This somewhat limited notion of optimality, however, can be misleading despite its wide use in theoretical computer science, as it hides the possibly huge but problem-independent constants which could make AIXI(t,l) and HSEARCH practically infeasible.