
Bibliography

Cliff and Ross, 1994
Cliff, D. and Ross, S. (1994).
Adding temporary memory to ZCS.
Adaptive Behavior, 3:101-150.

Dickmanns et al., 1987
Dickmanns, D., Schmidhuber, J., and Winklhofer, A. (1987).
Der genetische Algorithmus: Eine Implementierung in Prolog [The genetic algorithm: an implementation in Prolog]. Fortgeschrittenenpraktikum (advanced student project report), Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München.

Jaakkola et al., 1995
Jaakkola, T., Singh, S. P., and Jordan, M. I. (1995).
Reinforcement learning algorithm for partially observable Markov decision problems.
In Tesauro, G., Touretzky, D. S., and Leen, T. K., editors, Advances in Neural Information Processing Systems 7, pages 345-352. MIT Press.

Kaelbling, 1993
Kaelbling, L. (1993).
Learning in Embedded Systems.
MIT Press.

Kaelbling et al., 1995
Kaelbling, L., Littman, M., and Cassandra, A. (1995).
Planning and acting in partially observable stochastic domains.
Technical report, Brown University, Providence, RI.

Levin, 1973
Levin, L. A. (1973).
Universal sequential search problems.
Problems of Information Transmission, 9(3):265-266.

Levin, 1984
Levin, L. A. (1984).
Randomness conservation inequalities: Information and independence in mathematical theories.
Information and Control, 61:15-37.

Li and Vitányi, 1993
Li, M. and Vitányi, P. M. B. (1993).
An Introduction to Kolmogorov Complexity and its Applications.
Springer.

Littman, 1994
Littman, M. (1994).
Memoryless policies: Theoretical limitations and practical results.
In Cliff, D., Husbands, P., Meyer, J.-A., and Wilson, S. W., editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats 3, pages 297-305. MIT Press/Bradford Books.

McCallum, 1993
McCallum, R. A. (1993).
Overcoming incomplete perception with utile distinction memory.
In Machine Learning: Proceedings of the Tenth International Conference, Amherst, MA. Morgan Kaufmann.

McCallum, 1995
McCallum, R. A. (1995).
Instance-based utile distinctions for reinforcement learning with hidden state.
In Prieditis, A. and Russell, S., editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 387-395. Morgan Kaufmann Publishers, San Francisco, CA.

Ring, 1994
Ring, M. B. (1994).
Continual Learning in Reinforcement Environments.
PhD thesis, University of Texas at Austin, Austin, Texas.

Schmidhuber, 1995a
Schmidhuber, J. (1995a).
Discovering solutions with low Kolmogorov complexity and high generalization capability.
In Prieditis, A. and Russell, S., editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA.

Schmidhuber, 1995b
Schmidhuber, J. (1995b).
Environment-independent reinforcement acceleration.
Technical Report Note IDSIA-59-95, IDSIA.
Invited talk at the Hong Kong University of Science and Technology.

Solomonoff, 1986
Solomonoff, R. (1986).
An application of algorithmic probability to problems in artificial intelligence.
In Kanal, L. N. and Lemmer, J. F., editors, Uncertainty in Artificial Intelligence, pages 473-491. Elsevier Science Publishers.

Watanabe, 1992
Watanabe, O. (1992).
Kolmogorov complexity and computational complexity.
EATCS Monographs on Theoretical Computer Science, Springer.

Watkins and Dayan, 1992
Watkins, C. J. C. H. and Dayan, P. (1992).
Q-learning.
Machine Learning, 8:279-292.




