
Bibliography

1
Y. Bengio and P. Frasconi.
Credit assignment through time: Alternatives to backpropagation.
In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6, pages 75-82. Morgan Kaufmann, 1994.

2
Y. Bengio and P. Frasconi.
An input output HMM architecture.
In G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in Neural Information Processing Systems 7, pages 427-434. MIT Press, 1995.

3
Y. Bengio, P. Simard, and P. Frasconi.
Learning long-term dependencies with gradient descent is difficult.
IEEE Transactions on Neural Networks, 5(2):157-166, 1994.

4
A. Cleeremans, D. Servan-Schreiber, and J. L. McClelland.
Finite-state automata and simple recurrent networks.
Neural Computation, 1:372-381, 1989.

5
S. E. Fahlman.
The recurrent cascade-correlation architecture.
In R. P. Lippmann, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190-196. Morgan Kaufmann, 1991.

6
S. El Hihi and Y. Bengio.
Hierarchical recurrent neural networks for long-term dependencies.
In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 493-499. MIT Press, 1996.

7
S. Hochreiter.
Untersuchungen zu dynamischen neuronalen Netzen (Investigations of dynamic neural networks).
Diploma thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Universität München, 1991.
See www7.informatik.tu-muenchen.de/~hochreit.

8
S. Hochreiter and J. Schmidhuber.
Long short-term memory.
Technical Report FKI-207-95, Fakultät für Informatik, Technische Universität München, 1995.
Revised 1996 (see www.idsia.ch/~juergen, www7.informatik.tu-muenchen.de/~hochreit).

9
T. Lin, B. G. Horne, P. Tino, and C. L. Giles.
Learning long-term dependencies is not as difficult with NARX recurrent neural networks.
Technical Report UMIACS-TR-95-78 and CS-TR-3500, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, 1995.

10
P. Manolios and R. Fanelli.
First-order recurrent neural networks and deterministic finite state automata.
Neural Computation, 6:1155-1173, 1994.

11
C. B. Miller and C. L. Giles.
Experimental comparison of the effect of order in recurrent neural networks.
International Journal of Pattern Recognition and Artificial Intelligence, 7(4):849-872, 1993.

12
M. C. Mozer.
Induction of multiscale temporal structure.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 275-282. Morgan Kaufmann, 1992.

13
B. A. Pearlmutter.
Gradient calculations for dynamic recurrent neural networks: A survey.
IEEE Transactions on Neural Networks, 6(5):1212-1228, 1995.

14
J. B. Pollack.
The induction of dynamical recognizers.
Machine Learning, 7:227-252, 1991.

15
A. J. Robinson and F. Fallside.
The utility driven dynamic error propagation network.
Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.

16
J. Schmidhuber.
Learning complex, extended sequences using the principle of history compression.
Neural Computation, 4(2):234-242, 1992.

17
A. W. Smith and D. Zipser.
Learning sequential structures with the real-time recurrent learning algorithm.
International Journal of Neural Systems, 1(2):125-131, 1989.

18
M. Tomita.
Dynamic construction of finite automata from examples using hill-climbing.
In Proceedings of the Fourth Annual Cognitive Science Conference, pages 105-108, Ann Arbor, MI, 1982.

19
R. L. Watrous and G. M. Kuhn.
Induction of finite-state automata using second-order recurrent networks.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 309-316. Morgan Kaufmann, 1992.

20
R. J. Williams and J. Peng.
An efficient gradient-based algorithm for on-line training of recurrent network trajectories.
Neural Computation, 2(4):490-501, 1990.

