- Cliff and Ross, 1994
Cliff, D. and Ross, S. (1994). Adding temporary memory to ZCS. Adaptive Behavior, 3:101-150.
- Dickmanns et al., 1987
Dickmanns, D., Schmidhuber, J., and Winklhofer, A. (1987). Der genetische Algorithmus: Eine Implementierung in Prolog [The genetic algorithm: an implementation in Prolog]. Fortgeschrittenenpraktikum [advanced student project], Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München.
- Jaakkola et al., 1995
Jaakkola, T., Singh, S. P., and Jordan, M. I. (1995). Reinforcement learning algorithm for partially observable Markov decision problems. In Tesauro, G., Touretzky, D. S., and Leen, T. K., editors, Advances in Neural Information Processing Systems 7, pages 345-352. MIT Press.
- Kaelbling, 1993
Kaelbling, L. (1993). Learning in Embedded Systems. MIT Press.
- Kaelbling et al., 1995
Kaelbling, L., Littman, M., and Cassandra, A. (1995). Planning and acting in partially observable stochastic domains. Technical report, Brown University, Providence, RI.
- Levin, 1973
Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3):265-266.
- Levin, 1984
Levin, L. A. (1984). Randomness conservation inequalities: Information and independence in mathematical theories. Information and Control, 61:15-37.
- Li and Vitányi, 1993
Li, M. and Vitányi, P. M. B. (1993). An Introduction to Kolmogorov Complexity and its Applications. Springer.
- Littman, 1994
Littman, M. (1994). Memoryless policies: Theoretical limitations and practical results. In Cliff, D., Husbands, P., Meyer, J. A., and Wilson, S. W., editors, Proceedings of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats 3, pages 297-305. MIT Press/Bradford Books.
- McCallum, 1993
McCallum, R. A. (1993). Overcoming incomplete perception with utile distinction memory. In Machine Learning: Proceedings of the Tenth International Conference. Morgan Kaufmann, Amherst, MA.
- McCallum, 1995
McCallum, R. A. (1995). Instance-based utile distinctions for reinforcement learning with hidden state. In Prieditis, A. and Russell, S., editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 387-395. Morgan Kaufmann Publishers, San Francisco, CA.
- Ring, 1994
Ring, M. B. (1994). Continual Learning in Reinforcement Environments. PhD thesis, University of Texas at Austin, Austin, TX.
- Schmidhuber, 1995a
Schmidhuber, J. (1995a). Discovering solutions with low Kolmogorov complexity and high generalization capability. In Prieditis, A. and Russell, S., editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA.
- Schmidhuber, 1995b
Schmidhuber, J. (1995b). Environment-independent reinforcement acceleration. Technical Report Note IDSIA-59-95, IDSIA. Invited talk at Hong Kong University of Science and Technology.
- Solomonoff, 1986
Solomonoff, R. (1986). An application of algorithmic probability to problems in artificial intelligence. In Kanal, L. N. and Lemmer, J. F., editors, Uncertainty in Artificial Intelligence, pages 473-491. Elsevier Science Publishers.
- Watanabe, 1992
Watanabe, O. (1992). Kolmogorov Complexity and Computational Complexity. EATCS Monographs on Theoretical Computer Science. Springer.
- Watkins and Dayan, 1992
Watkins, C. J. C. H. and Dayan, P. (1992). Q-learning. Machine Learning, 8:279-292.
Juergen Schmidhuber
2003-02-25