- 1
-
A. G. Barto, R. S. Sutton, and C. W. Anderson.
Neuronlike adaptive elements that can solve difficult learning
control problems.
IEEE Transactions on Systems, Man, and Cybernetics,
SMC-13:834-846, 1983.
- 2
-
M. I. Jordan and D. E. Rumelhart.
Supervised learning with a distal teacher.
Technical Report Occasional Paper #40, Center for Cog. Sci.,
Massachusetts Institute of Technology, 1990.
- 3
-
D. Nguyen and B. Widrow.
The truck backer-upper: An example of self learning in neural
networks.
In IEEE/INNS International Joint Conference on Neural Networks,
Washington, D.C., volume 1, pages 357-364, 1989.
- 4
-
J. H. Schmidhuber.
Dynamische neuronale Netze und das fundamentale raumzeitliche
Lernproblem. Dissertation, Institut für Informatik, Technische
Universität München, 1990.
- 5
-
J. H. Schmidhuber.
Talk at the NIPS'90 workshop on dynamic networks led by R. Rohwer,
1990.
- 6
-
J. H. Schmidhuber.
Adaptive curiosity and adaptive confidence.
Technical Report FKI-149-91, Institut für Informatik, Technische
Universität München, April 1991.
- 7
-
J. H. Schmidhuber.
Adaptive decomposition of time.
In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 909-914. Elsevier Science Publishers
B.V., North-Holland, 1991.
- 8
-
J. H. Schmidhuber.
A possibility for implementing curiosity and boredom in
model-building neural controllers.
In J. A. Meyer and S. W. Wilson, editors, Proc. of the
International Conference on Simulation of Adaptive Behavior: From Animals to
Animats, pages 222-227. MIT Press/Bradford Books, 1991.
- 9
-
J. H. Schmidhuber.
Reinforcement learning in Markovian and non-Markovian environments.
In R. P. Lippmann, J. E. Moody, and D. S. Touretzky, editors,
Advances in Neural Information Processing Systems 3, pages 500-506.
San Mateo, CA: Morgan Kaufmann, 1991.
- 10
-
J. H. Schmidhuber and R. Huber.
Learning to generate artificial fovea trajectories for target
detection.
International Journal of Neural Systems, 2(1 & 2):135-141,
1991.
- 11
-
R. S. Sutton.
First results with DYNA, an integrated architecture for learning,
planning and reacting.
In Proceedings of the AAAI Spring Symposium on Planning in
Uncertain, Unpredictable, or Changing Environments, 1990.
- 12
-
S. Thrun and K. Möller.
On planning and exploration in non-discrete environments.
Technical report, Gesellschaft für Mathematik und
Datenverarbeitung, D-5205 St. Augustin, Germany, March 1991.
- 13
-
C. Watkins.
Learning from Delayed Rewards.
PhD thesis, King's College, University of Cambridge, 1989.
- 14
-
P. J. Werbos.
Beyond Regression: New Tools for Prediction and Analysis in the
Behavioral Sciences.
PhD thesis, Harvard University, 1974.
- 15
-
P. J. Werbos.
Building and understanding adaptive systems: A statistical/numerical
approach to factory automation and brain research.
IEEE Transactions on Systems, Man, and Cybernetics, 17, 1987.
- 16
-
R. J. Williams.
Toward a theory of reinforcement-learning connectionist systems.
Technical Report NU-CCS-88-3, College of Comp. Sci., Northeastern
University, Boston, MA, 1988.
Juergen Schmidhuber
2003-02-28