C. W. Anderson.
Learning and Problem Solving with Multilayer Connectionist Systems.
PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1986.
M. I. Jordan.
Supervised learning and systems with excess degrees of freedom.
Technical Report COINS TR 88-27, Massachusetts Institute of Technology, 1988.
M. I. Jordan and R. A. Jacobs.
Learning to control an unstable system with forward modeling.
In Proc. of the 1990 Connectionist Models Summer School, in press.
Morgan Kaufmann, 1990.
S. W. Piché.
Draft: First order gradient descent training of adaptive discrete
time dynamic networks.
Technical report, Dept. of Electrical Engineering, Stanford University.
A. J. Robinson and F. Fallside.
The utility driven dynamic error propagation network.
Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering
Department, 1987.
T. Robinson and F. Fallside.
Dynamic reinforcement driven error propagation networks with
application to game playing.
In Proceedings of the 11th Conference of the Cognitive Science
Society, Ann Arbor, pages 836-843, 1989.
J. Schmidhuber.
Making the world differentiable: On using fully recurrent
self-supervised neural networks for dynamic reinforcement learning and
planning in non-stationary environments.
Technical Report FKI-126-90 (revised), Institut für Informatik,
Technische Universität München, November 1990.
(Revised and extended version of an earlier report from February.)
J. Schmidhuber.
Networks adjusting networks.
In J. Kindermann and A. Linden, editors, Proceedings of
`Distributed Adaptive Neural Information Processing', St. Augustin,
24-25 May 1989, pages 197-208. Oldenbourg, 1990.
In November 1990 a revised and extended version appeared as
FKI-Report FKI-125-90 (revised) at the Institut für Informatik,
Technische Universität München.
J. Schmidhuber.
Towards compositional learning with dynamic neural networks.
Technical Report FKI-129-90, Institut für Informatik, Technische
Universität München, 1990.
R. S. Sutton.
Learning to predict by the methods of temporal differences.
Machine Learning, 3:9-44, 1988.
P. J. Werbos.
Building and understanding adaptive systems: A statistical/numerical
approach to factory automation and brain research.
IEEE Transactions on Systems, Man, and Cybernetics, 17, 1987.
R. J. Williams.
On the use of backpropagation in associative reinforcement learning.
In IEEE International Conference on Neural Networks, San Diego,
volume 2, pages 263-270, 1988.
R. J. Williams and D. Zipser.
Experimental analysis of the real-time recurrent learning algorithm.
Connection Science, 1(1):87-111, 1989.