Recurrent Neural Networks from Learning Attractor Dynamics
Stefan Schaal
Computer Science & Neuroscience, Univ. of Southern California, USA
ATR Computational Neuroscience Laboratory, Japan
Many forms of recurrent neural networks can be understood in terms of
the dynamical systems theory of difference equations or differential
equations. Learning in such systems corresponds to adjusting some
internal parameters to obtain a desired time evolution of the network,
which can usually be characterized in terms of point attractor dynamics,
limit cycle dynamics, or, in rarer cases, strange attractor
or chaotic dynamics. Finding a stable learning process to adjust the
open parameters of the network towards shaping the desired attractor
type and basin of attraction has remained a complex task, as the
parameter trajectories during learning can lead the system through a
variety of undesirable unstable behaviors, such that learning may never
succeed.
In this presentation, we review a recently developed learning framework
for a class of recurrent neural networks that employs a more structured
network approach. We assume that the canonical system behavior is known
a priori, e.g., it is a point attractor or a limit cycle. With either
supervised learning or reinforcement learning, it is possible to
acquire the transformation from a simple representative of this
canonical behavior (e.g., a second-order linear point attractor, or a
simple limit cycle oscillator) to the desired highly complex attractor
form. For supervised learning, one-shot learning based on locally
weighted regression techniques is possible. For reinforcement learning,
stochastic policy gradient techniques can be employed. In any case, the
recurrent network learned by these methods inherits the stability
properties of the simple dynamic system that underlies the nonlinear
transformation, such that stability of the learning approach is not a
problem. We demonstrate the success of this approach for learning
various skills on a humanoid robot, including tasks that require
incorporating additional sensory signals as coupling terms to modify the
recurrent network evolution on-line.
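
The supervised case can be illustrated with a minimal sketch. It assumes one particular formulation that the abstract does not spell out: a critically damped second-order linear point attractor, driven by a learned forcing term that is gated by a decaying phase variable, with the forcing weights fit in one shot by locally weighted regression on a single demonstration. The gains, the Gaussian basis layout, the synthetic demonstration, and all function names (`basis`, `fit`, `rollout`) are illustrative assumptions, not the method as implemented on the robot.

```python
import numpy as np

# Illustrative gains (assumptions): BETA = ALPHA / 4 gives critical damping.
ALPHA, BETA, ALPHA_X = 25.0, 6.25, 4.0


def basis(x, centers, widths):
    """Gaussian basis activations at phase value(s) x."""
    return np.exp(-widths * (np.asarray(x)[..., None] - centers) ** 2)


def fit(y_demo, dt, n_basis=20):
    """One-shot fit of forcing-term weights from a single demonstration."""
    y_demo = np.asarray(y_demo, dtype=float)
    tau = (len(y_demo) - 1) * dt                  # duration of the demonstration
    y0, g = y_demo[0], y_demo[-1]                 # start state and goal (attractor)
    yd = np.gradient(y_demo, dt)                  # finite-difference velocity
    ydd = np.gradient(yd, dt)                     # finite-difference acceleration
    # Phase variable: decays from 1 toward 0 over the demonstration.
    x = np.exp(-ALPHA_X * np.arange(len(y_demo)) * dt / tau)
    # Forcing term the linear attractor would need to reproduce the demonstration.
    f_target = tau**2 * ydd - ALPHA * (BETA * (g - y_demo) - tau * yd)
    centers = np.exp(-ALPHA_X * np.linspace(0.0, 1.0, n_basis))
    widths = n_basis**1.5 / centers / ALPHA_X     # heuristic width choice
    psi = basis(x, centers, widths)               # shape (T, n_basis)
    # Locally weighted regression: an independent weighted least-squares
    # fit of f_target ~ w_i * x for each basis function i.
    num = (psi * (x * f_target)[:, None]).sum(axis=0)
    den = (psi * (x**2)[:, None]).sum(axis=0) + 1e-10
    return dict(w=num / den, centers=centers, widths=widths, y0=y0, g=g, tau=tau)


def rollout(m, dt, duration, w=None):
    """Integrate the learned system with simple Euler-type steps."""
    w = m["w"] if w is None else w
    y, z, x = m["y0"], 0.0, 1.0
    traj = []
    for _ in range(int(round(duration / dt))):
        psi = basis(x, m["centers"], m["widths"])
        f = x * (psi @ w) / (psi.sum() + 1e-10)   # forcing vanishes as x -> 0
        z += dt * (ALPHA * (BETA * (m["g"] - y) - z) + f) / m["tau"]
        y += dt * z / m["tau"]
        x += dt * (-ALPHA_X * x) / m["tau"]
        traj.append(y)
    return np.array(traj)


# Synthetic demonstration (an assumption): a smooth reach from 0 to 1 with a bump.
dt = 0.001
s = np.linspace(0.0, 1.0, 1001)
y_demo = 10 * s**3 - 15 * s**4 + 6 * s**5 + 0.2 * np.sin(np.pi * s) ** 2

model = fit(y_demo, dt)
y_learned = rollout(model, dt, duration=3.0)
# Deliberately scaled weights: the path changes, the attractor does not.
y_perturbed = rollout(model, dt, duration=3.0, w=5.0 * model["w"])
```

In this sketch the forcing term is multiplied by the phase variable, which decays to zero, so the system reduces to the underlying linear point attractor for any finite weights. Both `y_learned` and `y_perturbed` therefore settle at the goal `model["g"]`, which is one way to read the claim that the learned network inherits the stability of the simple system.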