Gödel Machine vs OOPS-RL

The Optimal Ordered Problem Solver OOPS [38,40] (used by BIOPS in Section 2.3) is a bias-optimal (see Def. 2.1) way of searching for a program that solves each problem in an ordered sequence of problems of a reasonably general type, continually organizing, managing, and reusing previously acquired knowledge. Solomonoff recently proposed related ideas for a scientist's assistant [48] that modifies the probability distribution of universal search [22] based on experience.

As pointed out earlier [38] (section on OOPS limitations), however, OOPS-like methods are not directly applicable to general lifelong reinforcement learning tasks such as those for which AIXI [15] was designed. It is possible, though, to use two OOPS modules as components of a rather general reinforcement learner (OOPS-RL): one module learns a predictive model of the environment, and the other uses this world model to search for an action sequence that maximizes expected reward [38,41]. Despite the bias-optimality of OOPS for certain ordered task sequences, OOPS-RL is not necessarily the best way of spending limited time in general reinforcement learning situations [18], such as those in which the Gödel machine is optimal in the sense of its utility function.
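The two-module split described above can be illustrated with a toy Python sketch. It is not OOPS and has no bias-optimality property: a lookup table stands in for the model-learning module, and plain exhaustive enumeration stands in for the searching module. The environment `toy_env`, the action set `ACTIONS`, and the helpers `learn_model` and `plan` are hypothetical names introduced only for illustration, and the environment is deterministic, so expected reward reduces to predicted reward.

```python
from itertools import product

ACTIONS = (0, 1)  # hypothetical action set: 0 = stay, 1 = step right


def toy_env(state, action):
    """Hypothetical 4-state chain; reward 1.0 whenever the next state is 3."""
    next_state = min(state + action, 3)
    return next_state, (1.0 if next_state == 3 else 0.0)


def learn_model(experience):
    """Module 1 stand-in: memorize observed (state, action) -> (next_state, reward)."""
    model = {}
    for state, action, next_state, reward in experience:
        model[(state, action)] = (next_state, reward)
    return model


def plan(model, start, horizon):
    """Module 2 stand-in: enumerate action sequences and score them with the model."""
    best_seq, best_return = None, float("-inf")
    for seq in product(ACTIONS, repeat=horizon):
        state, total = start, 0.0
        for action in seq:
            if (state, action) not in model:
                break  # the model cannot predict this step; discard the sequence
            state, reward = model[(state, action)]
            total += reward
        else:
            # only reached when every step of the sequence was predictable
            if total > best_return:
                best_seq, best_return = seq, total
    return best_seq, best_return


# Collect experience by trying every action once in every state.
experience = []
for s in range(4):
    for a in ACTIONS:
        ns, r = toy_env(s, a)
        experience.append((s, a, ns, r))

model = learn_model(experience)
best_seq, best_return = plan(model, start=0, horizon=4)
```

Note that the planner consults only the learned model, never `toy_env` itself, which mirrors the division of labor between the two modules. The cost of the enumeration grows exponentially with the horizon, a small-scale reminder that how limited time is spent matters in reinforcement learning settings.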

Juergen Schmidhuber 2003-09-29
