Remedies
The above theoretical investigations indicate a basic limitation of
gradient descent as a search procedure for finding optimal weights in
an RNN. Several proposals have been made to cope with the problem of
long-term dependencies, some attempting to solve the optimization
problem using alternative search algorithms, others trying to devise
alternative architectures. In the following we give a brief account
of these proposals.
Juergen Schmidhuber
2003-02-19