Training: with gradient-based learning, forget tasks whose minimal time lags exceed 10 steps! Backpropagated error signals vanish over such lags (see the sketch below).
So why study RNNs at all?
Hope for generalizing from short exemplars? Sometimes justified, often not.
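The "forget minimal time lags > 10" warning above is the vanishing gradient effect: the backpropagated error is multiplied by the recurrent Jacobian once per time step, so over long lags it typically shrinks (or, with large weights, explodes) geometrically. The toy NumPy sketch below only illustrates this shrinkage; the network size, weight scale, and random states are arbitrary choices for the demo, not taken from the original papers.

```python
import numpy as np

# Toy illustration of vanishing error signals in a plain tanh RNN:
# the backpropagated error is multiplied by W_rec^T diag(1 - h^2) at every
# step, so its norm tends to shrink geometrically with the time lag.
# (For brevity the Jacobians are applied along the forward trajectory;
# the norm behaviour is the point, not an exact BPTT pass.)

rng = np.random.default_rng(0)
n = 20
W_rec = rng.normal(scale=0.8 / np.sqrt(n), size=(n, n))  # recurrent weights
h = rng.normal(size=n)       # some hidden state
delta = rng.normal(size=n)   # error signal injected at the current step

for lag in range(1, 31):
    h = np.tanh(W_rec @ h)                      # forward step
    delta = W_rec.T @ ((1.0 - h**2) * delta)    # carry the error one step back
    if lag in (1, 5, 10, 20, 30):
        print(f"lag {lag:2d}: |error signal| ~ {np.linalg.norm(delta):.1e}")
```

With larger recurrent weights (e.g. scale=2.0/np.sqrt(n)) the same loop shows the norm exploding instead of vanishing, the other half of the problem.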
To overcome the long time lag problem: history compression in an RNN hierarchy - level n gets only the inputs that level n-1 failed to predict (Schmidhuber, NIPS 1991; Neural Computation 1992).
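The data flow of such a compressor can be sketched in a few lines. The snippet below is only an illustrative sketch under assumptions of mine (an untrained one-layer tanh predictor, a fixed surprise threshold, a toy sequence), not the original 1991/1992 system or its training procedure: a level-0 RNN tries to predict its next input, and only the inputs it fails to predict are passed on, forming the much shorter sequence seen by level 1.

```python
import numpy as np

class SimpleRNNPredictor:
    """One-layer tanh RNN that predicts its next input vector (weights untrained)."""
    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(scale=0.3, size=(hidden_size, input_size))
        self.W_rec = rng.normal(scale=0.3, size=(hidden_size, hidden_size))
        self.W_out = rng.normal(scale=0.3, size=(input_size, hidden_size))
        self.h = np.zeros(hidden_size)

    def step(self, x):
        # Update the hidden state and emit a prediction of the *next* input.
        self.h = np.tanh(self.W_in @ x + self.W_rec @ self.h)
        return self.W_out @ self.h


def compress(sequence, predictor, threshold=0.5):
    """Return the (time, input) pairs the low-level predictor got wrong.

    These "surprising" steps form the shorter sequence fed to the next
    level of the hierarchy; predictable steps are dropped.
    """
    compressed = []
    prediction = np.zeros_like(sequence[0])
    for t, x in enumerate(sequence):
        if np.linalg.norm(x - prediction) > threshold:
            compressed.append((t, x))       # unexpected input -> pass it up
        prediction = predictor.step(x)      # predict the next input
    return compressed


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy sequence: mostly repeated symbols with a few rare "events".
    seq = [np.zeros(4) for _ in range(50)]
    for t in (7, 23, 41):
        seq[t] = rng.normal(size=4)

    level0 = SimpleRNNPredictor(input_size=4, hidden_size=8)
    higher_level_input = compress(seq, level0)
    print(f"{len(seq)} steps at level 0 -> "
          f"{len(higher_level_input)} unexpected steps passed to level 1")
```

In the full scheme each level is itself a predictive RNN trained the same way, so expected events are absorbed low in the hierarchy and only unexpected ones propagate upward across long time lags.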
Other 1990s ideas: Mozer, Ring, Bengio, Frasconi, Giles, Omlin, Sun, ...