
Bibliography

1
Y. Bengio and P. Frasconi.
Credit assignment through time: Alternatives to backpropagation.
In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6, pages 75-82. Morgan Kaufmann, 1994.

2
Y. Bengio and P. Frasconi.
An input output HMM architecture.
In G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in Neural Information Processing Systems 7, pages 427-434. MIT Press, 1995.

3
Y. Bengio, P. Simard, and P. Frasconi.
Learning long-term dependencies with gradient descent is difficult.
IEEE Transactions on Neural Networks, 5(2):157-166, 1994.

4
A. Cleeremans, D. Servan-Schreiber, and J. L. McClelland.
Finite-state automata and simple recurrent networks.
Neural Computation, 1:372-381, 1989.

5
S. E. Fahlman.
The recurrent cascade-correlation architecture.
In R. P. Lippmann, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 190-196. Morgan Kaufmann, 1991.

6
S. El Hihi and Y. Bengio.
Hierarchical recurrent neural networks for long-term dependencies.
In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 493-499. MIT Press, 1996.

7
S. Hochreiter.
Untersuchungen zu dynamischen neuronalen Netzen (Investigations of dynamic neural networks).
Diploma thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Universität München, 1991.
See www7.informatik.tu-muenchen.de/~hochreit.

8
S. Hochreiter and J. Schmidhuber.
Long short-term memory.
Technical Report FKI-207-95, Fakultät für Informatik, Technische Universität München, 1995.
Revised 1996 (see www.idsia.ch/~juergen, www7.informatik.tu-muenchen.de/~hochreit).

9
T. Lin, B. G. Horne, P. Tino, and C. L. Giles.
Learning long-term dependencies is not as difficult with NARX recurrent neural networks.
Technical Report UMIACS-TR-95-78 and CS-TR-3500, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, 1995.

10
P. Manolios and R. Fanelli.
First-order recurrent neural networks and deterministic finite state automata.
Neural Computation, 6:1155-1173, 1994.

11
C. B. Miller and C. L. Giles.
Experimental comparison of the effect of order in recurrent neural networks.
International Journal of Pattern Recognition and Artificial Intelligence, 7(4):849-872, 1993.

12
M. C. Mozer.
Induction of multiscale temporal structure.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 275-282. Morgan Kaufmann, 1992.

13
B. A. Pearlmutter.
Gradient calculations for dynamic recurrent neural networks: A survey.
IEEE Transactions on Neural Networks, 6(5):1212-1228, 1995.

14
J. B. Pollack.
The induction of dynamical recognizers.
Machine Learning, 7:227-252, 1991.

15
A. J. Robinson and F. Fallside.
The utility driven dynamic error propagation network.
Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.

16
J. Schmidhuber.
Learning complex, extended sequences using the principle of history compression.
Neural Computation, 4(2):234-242, 1992.

17
A. W. Smith and D. Zipser.
Learning sequential structures with the real-time recurrent learning algorithm.
International Journal of Neural Systems, 1(2):125-131, 1989.

18
M. Tomita.
Dynamic construction of finite automata from examples using hill-climbing.
In Proceedings of the Fourth Annual Cognitive Science Conference, pages 105-108, Ann Arbor, MI, 1982.

19
R. L. Watrous and G. M. Kuhn.
Induction of finite-state automata using second-order recurrent networks.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 309-316. Morgan Kaufmann, 1992.

20
R. J. Williams and J. Peng.
An efficient gradient-based algorithm for on-line training of recurrent network trajectories.
Neural Computation, 2(4):490-501, 1990.

