
REDUCING THE RATIO BETWE..., 1993

J. Schmidhuber

ABSTRACT. Let $m$ be the number of time-varying variables for storing temporal events in a fully recurrent sequence processing network. Let $R_{time}$ be the ratio between the number of operations per time step (for an exact gradient-based supervised sequence learning algorithm) and $m$. Let $R_{space}$ be the ratio between the maximum number of storage cells necessary for learning arbitrary sequences and $m$. With conventional recurrent nets, $m$ equals the number of units. With the popular "real-time recurrent learning" algorithm (RTRL), $R_{time} = O(m^3)$ and $R_{space} = O(m^2)$. With "back-propagation through time" (BPTT), $R_{time} = O(m)$ (much better than with RTRL) but $R_{space}$ is unbounded (much worse than with RTRL). The contribution of this paper is a novel fully recurrent network and a corresponding exact gradient-based learning algorithm with $R_{time} = O(m)$ (as good as with BPTT) and $R_{space} = O(m^2)$ (as good as with RTRL).
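As an illustrative sketch (not part of the paper), the asymptotic ratios stated above can be tabulated as a function of $m$. The function name and the convention of dropping constant factors are assumptions for this example; `float("inf")` marks BPTT's unbounded storage requirement for arbitrarily long sequences:

```python
def ratios(m, algorithm):
    """Return (R_time, R_space) for the three algorithms compared in the
    abstract, with constant factors omitted.

    R_time  = (operations per time step) / m
    R_space = (maximum number of storage cells) / m
    """
    if algorithm == "RTRL":
        # real-time recurrent learning: O(m^3) time ratio, O(m^2) space ratio
        return (m ** 3, m ** 2)
    if algorithm == "BPTT":
        # back-propagation through time: O(m) time ratio, unbounded storage
        return (m, float("inf"))
    if algorithm == "proposed":
        # this paper's algorithm: BPTT-like time, RTRL-like space
        return (m, m ** 2)
    raise ValueError(f"unknown algorithm: {algorithm}")
```

For example, at $m = 100$ the proposed algorithm's per-variable operation count grows like $100$ rather than RTRL's $10^6$, while its storage ratio stays at $10^4$ instead of growing without bound as with BPTT.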





Juergen Schmidhuber 2003-02-21

