

Fundamental Limitations of OOPS

An appropriate task sequence may help OOPS to reduce the slowdown factor of plain LSEARCH through experience. Given a single task, however, OOPS does not by itself invent an appropriate series of easier subtasks whose solutions should be frozen first. Of course, since both LSEARCH and OOPS may search in general algorithm space, some of the programs they execute may be viewed as self-generated subgoal-definers and subtask solvers. But with a single given task there is no incentive to freeze intermediate solutions before the original task is solved. The potential speed-up of OOPS does stem from exploiting external information encoded within an ordered task sequence. This motivates its very name.

Given some final task, a badly chosen training sequence of intermediate tasks may cost more search time than required for solving just the final task by itself, without any intermediate tasks.

OOPS is designed for resettable environments. In nonresettable environments it loses parts of its theoretical foundation. For example, it is possible to use OOPS for designing optimal trajectories of robot arms in virtual simulations. But once we are working with a real physical robot, there may be no guarantee that we will be able to reset it as precisely as the backtracking procedure Try requires.
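The dependence of backtracking on exact resets can be illustrated with a minimal sketch. It is not the Try procedure itself; the names ResettableEnv, mark, reset_to, and backtrack are hypothetical and chosen only for this illustration. The toy environment logs every state change so that a depth-first search can undo the changes when a branch fails. A physical robot offers no such exact undo log, which is why the guarantee is lost there.

```python
from typing import Callable, List, Sequence, Tuple


class ResettableEnv:
    """Toy simulated environment (hypothetical): every write is logged so it can be undone."""

    def __init__(self, size: int) -> None:
        self.cells: List[int] = [0] * size
        self._undo_log: List[Tuple[int, int]] = []  # (index, previous value)

    def write(self, index: int, value: int) -> None:
        # Remember the old value before overwriting it.
        self._undo_log.append((index, self.cells[index]))
        self.cells[index] = value

    def mark(self) -> int:
        # A mark is simply the current length of the undo log.
        return len(self._undo_log)

    def reset_to(self, mark: int) -> None:
        # Undo all writes made since the mark, most recent first.
        while len(self._undo_log) > mark:
            index, old_value = self._undo_log.pop()
            self.cells[index] = old_value


def backtrack(
    env: ResettableEnv,
    position: int,
    is_goal: Callable[[List[int]], bool],
    choices: Sequence[int] = (0, 1),
) -> bool:
    """Depth-first search over cell assignments; restores the state after each failed branch."""
    if position == len(env.cells):
        return is_goal(env.cells)
    for value in choices:
        mark = env.mark()
        env.write(position, value)
        if backtrack(env, position + 1, is_goal, choices):
            return True
        env.reset_to(mark)  # exact reset: only possible because all changes were logged
    return False


if __name__ == "__main__":
    env = ResettableEnv(3)
    found = backtrack(env, 0, lambda cells: cells == [1, 0, 1])
    print(found, env.cells)
```

After a failed search the environment is back in its initial state, and after a successful one it holds the solution found. Both properties rely on the undo log being complete, which is the assumption a real physical system may violate.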

OOPS neglects one source of potential speed-up relevant for reinforcement learning [24]: it does not predict future tasks from previous ones, and does not spend a fraction of its time on solving predicted tasks. Such issues will be addressed in the next two subsections on reinforcement learning.


Juergen Schmidhuber 2004-04-15
