Next: THE ALGORITHM
Up: A FIXED SIZE STORAGE
Previous: A FIXED SIZE STORAGE
There are two basic methods for performing steepest descent in fully
recurrent networks with non-input units and input units.
`Back propagation through time' (BPTT)
(e.g. [Williams and Peng, 1990]) requires potentially
unlimited storage in proportion to the
length of the longest training sequence but needs
only computations per time step.
BPTT is the method of
choice if training sequences are known to have less
than time steps.
For training sequences involving many more
time steps than , for training sequences of unknown length, and
for on-line learning in general one
would like to have an algorithm with upper bounds for storage and for
computations required per time step.
Such an algorithm is the RTRL-algorithm
[Robinson and Fallside, 1987][Williams and Zipser, 1989]. It
requires only fixed-size storage of the order
but is computationally expensive: It requires
operations per time step3.
The algorithm described herein requires
storage, too.
Every time steps
it requires operations, but on all other
time steps it requires only operations.
This cuts the average
time complexity per time step to .
Juergen Schmidhuber
2003-02-13
Back to Recurrent Neural Networks page