
INTRODUCTION

Many researchers in neuro-control and reinforcement learning believe that a `compositional' method, one that learns to reach new goals by combining familiar action sequences into more complex ones, is necessary to overcome the scaling problems of non-compositional algorithms.

The few previous approaches to `compositional neural sequence learning' are inspired by dynamic programming and arrange reinforcement learning networks hierarchically (e.g. [Watkins, 1989], [Jameson, 1991], [Singh, 1992]; see also [Ring, 1991] for alternative ideas).

Our approach is entirely different. It builds on initial ideas presented in [Schmidhuber, 1991a]. We describe gradient-based procedures for transforming knowledge about previously learned action sequences into appropriate subgoals for new problems. No external teacher is required. The approach is limited, however, in that it relies on differentiable (possibly adaptive) models of the costs associated with known action sequences.
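To make the idea of gradient-based subgoal generation concrete, here is a minimal sketch. It is not the procedure developed in this paper. It assumes a fixed, hand-written cost model (squared Euclidean distance between the start and end points of a known action sequence) in place of a learned, adaptive one, and it plans a single subgoal by gradient descent on the summed cost of the two resulting sub-sequences. All function and parameter names are hypothetical.

```python
# Illustrative sketch only: the cost model below is an assumption
# (squared Euclidean distance), standing in for a differentiable,
# possibly adaptive model of the cost of a known action sequence.

def cost(a, b):
    """Assumed cost of moving from point a to point b."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def grad_wrt_end(a, b):
    """Gradient of cost(a, b) with respect to the end point b."""
    return [2.0 * (bi - ai) for ai, bi in zip(a, b)]

def grad_wrt_start(a, b):
    """Gradient of cost(a, b) with respect to the start point a."""
    return [2.0 * (ai - bi) for ai, bi in zip(a, b)]

def plan_subgoal(start, goal, init, lr=0.1, steps=200):
    """Gradient descent on cost(start, x) + cost(x, goal) over subgoal x."""
    x = list(init)
    for _ in range(steps):
        g_first = grad_wrt_end(start, x)    # x ends the first sub-sequence
        g_second = grad_wrt_start(x, goal)  # x starts the second sub-sequence
        x = [xi - lr * (g1 + g2) for xi, g1, g2 in zip(x, g_first, g_second)]
    return x

start, goal, init = [0.0, 0.0], [4.0, 2.0], [3.0, -1.0]
subgoal = plan_subgoal(start, goal, init)
```

Under this assumed cost model the summed cost is minimized at the midpoint between start and goal, so the subgoal drifts from its initial guess toward that point. No teacher supplies the subgoal; the only training signal is the gradient of the cost model.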



Juergen Schmidhuber 2003-03-14
