INTRODUCTION

There are two basic methods for performing steepest descent in fully recurrent networks with $n$ non-input units and $m = O(n)$ input units. `Back propagation through time' (BPTT) (e.g. [Williams and Peng, 1990]) requires potentially unlimited storage, growing in proportion to the length of the longest training sequence, but needs only $O(n^2)$ computations per time step. BPTT is the method of choice if training sequences are known to have fewer than $O(n)$ time steps. For training sequences involving many more time steps than $n$, for training sequences of unknown length, and for on-line learning in general, one would like an algorithm with fixed upper bounds on both the storage and the computations required per time step. Such an algorithm is the RTRL algorithm [Robinson and Fallside, 1987; Williams and Zipser, 1989]. It requires only fixed-size storage of order $O(n^3)$ but is computationally expensive: it requires $O(n^4)$ operations per time step. The algorithm described herein also requires $O(n^3)$ storage. Every $O(n)$ time steps it requires $O(n^4)$ operations, but on all other time steps it requires only $O(n^2)$ operations. This cuts the average time complexity per time step to $O(n^3)$.
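The amortized bound can be checked with a short cost-model sketch. This is not the algorithm itself, only a hypothetical operation count illustrating why one $O(n^4)$ step per block of $n$ steps, with $O(n^2)$ on the remaining steps, averages to $O(n^3)$; the constant factors are assumed to be 1 for simplicity.

```python
def average_ops_per_step(n, steps):
    """Average operation count per time step under the assumed cost model:
    every n-th step costs n**4 operations (the expensive update),
    every other step costs n**2 operations (the cheap update)."""
    total = 0
    for t in range(steps):
        if t % n == 0:
            total += n ** 4   # expensive step at each block boundary
        else:
            total += n ** 2   # cheap step
    return total / steps

# Per block of n steps: (n**4 + (n - 1) * n**2) / n = n**3 + n**2 - n,
# i.e. O(n**3) on average, matching the claim in the text.
for n in (10, 20, 40):
    print(n, average_ops_per_step(n, 100 * n), n ** 3)
```

For $n = 10$ over 1000 steps, for instance, the average is $1090$ operations per step, close to $n^3 = 1000$ and far below the $n^4 = 10000$ per-step cost of RTRL under the same model.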



Juergen Schmidhuber 2003-02-13