
CONCLUDING REMARKS

Like the RTRL algorithm, the method needs a fixed amount of storage of order $O(n^3)$. Like RTRL (but unlike the methods described in [Williams and Peng, 1990] and [Zipser, 1989]), the algorithm computes the exact gradient. Since it is $O(n)$ times faster than RTRL, it should be preferred.
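The speedup claim can be made concrete with an illustrative sketch (not the algorithm itself): RTRL maintains $O(n^3)$ partial derivatives and updates each in $O(n)$ time per step, for $O(n^4)$ operations, while the algorithm presented here needs $O(n^3)$ operations per step on average. The cost functions below are only the leading-order terms, not exact operation counts.

```python
# Illustrative sketch of the asymptotic comparison (leading-order terms only,
# not exact operation counts): RTRL costs O(n^4) per time step, the
# fixed-size-storage algorithm costs O(n^3) on average, so the speedup
# factor grows linearly with the number of units n.

def rtrl_ops_per_step(n):
    # RTRL updates n^3 partial derivatives, each in O(n) time.
    return n ** 4

def hybrid_ops_per_step(n):
    # Average per-step cost of the fixed-size-storage algorithm.
    return n ** 3

for n in (10, 100, 1000):
    speedup = rtrl_ops_per_step(n) / hybrid_ops_per_step(n)
    print(f"n={n}: speedup factor {speedup:.0f}")
```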

Following the argument in [Williams and Peng, 1990], continuous-time versions of BPTT and RTRL [Pearlmutter, 1989] [Gherrity, 1989] can serve as a basis for a correspondingly efficient continuous-time version of the algorithm presented here (by means of Euler discretization).
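As a hedged sketch of what Euler discretization means here: continuous-time recurrent dynamics of the generic form $dy/dt = -y + \sigma(Wy + x)$ (a common form in the continuous-time literature; the symbols $W$, $x$, and the step size below are illustrative, not taken from this paper) become a discrete-time update by replacing the derivative with a finite difference.

```python
import numpy as np

# Forward-Euler discretization of generic continuous-time recurrent dynamics
#   dy/dt = -y + sigma(W y + x).
# W, x, and dt are illustrative placeholders, not quantities from the paper.

def sigma(z):
    return np.tanh(z)

def euler_step(y, W, x, dt):
    # One forward-Euler step: y(t + dt) ≈ y(t) + dt * dy/dt.
    return y + dt * (-y + sigma(W @ y + x))

rng = np.random.default_rng(0)
n = 4
W = 0.5 * rng.standard_normal((n, n))
x = rng.standard_normal(n)
y = np.zeros(n)
for _ in range(100):
    y = euler_step(y, W, x, dt=0.1)
# With dt = 0.1 each component obeys |y| <= 0.9*|y| + 0.1, so the
# iterates stay bounded in (-1, 1) while approximating the trajectory.
```

The same substitution applied to the gradient dynamics turns a continuous-time BPTT or RTRL scheme into a discrete-time one, which is why the efficiency argument carries over.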

Many typical environments produce input sequences that have both local and more global temporal structure. For instance, input sequences are often hierarchically organized (e.g. speech). In such cases, sequence-composing algorithms [Schmidhuber, 1991] [Schmidhuber, 1992] can provide superior alternatives to pure gradient-based algorithms.



Juergen Schmidhuber 2003-02-13
