REINFORCEMENT DRIVEN INFORMATION ACQUISITION IN NON-DETERMINISTIC ENVIRONMENTS

Jürgen Schmidhuber, Jan Storck, Josef Hochreiter, TUM

In Proc. ICANN'95, vol. 2, pages 159-164. EC2 & CIE, Paris, 1995.

Abstract:

For an agent living in a non-deterministic Markov environment (NME), what is, in theory, the fastest way of acquiring information about the environment's statistical properties? The answer is to design "optimal" sequences of "experiments" by performing action sequences that maximize expected information gain. This notion is implemented by combining concepts from information theory and reinforcement learning. Experiments show that the resulting method, reinforcement driven information acquisition, can explore certain NMEs much faster than conventional random exploration.
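The idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that information gain is measured as the Kullback-Leibler divergence between the agent's estimated next-state distribution after and before an observation, and that this quantity serves as the only reward for a tabular Q-learner. The function names, the Laplace smoothing of the counts, the epsilon-greedy action selection, and all parameter values are assumptions made for this illustration.

```python
import math
import random

def estimate(counts):
    """Laplace-smoothed next-state distribution (smoothing is an assumption)."""
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) in nats; both inputs are strictly positive here."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def information_gain(counts, next_state):
    """Update counts in place; return KL(new estimate || old estimate)."""
    old = estimate(counts)
    counts[next_state] += 1
    return kl_divergence(estimate(counts), old)

def explore(transitions, steps=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Q-learning whose only reward is the information gain of each step.

    transitions[s][a] is the (hidden) next-state distribution of a
    hypothetical environment; the agent only sees sampled next states.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(transitions), len(transitions[0])
    counts = [[[0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy choice over the current Q-values.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])
        next_state = rng.choices(range(n_states), weights=transitions[state][action])[0]
        # The reward is how much the observation changed the agent's model.
        reward = information_gain(counts[state][action], next_state)
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return counts, q
```

Under these assumptions, a state/action pair that has already been tried many times yields only a small change in the estimate and hence a small reward, so the learned Q-values tend to steer the agent toward parts of the environment it knows less about.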




Keywords: Exploration, reinforcement learning, Q-learning, information gain, maximum likelihood models, non-deterministic Markov environments, reinforcement driven information acquisition.





Juergen Schmidhuber 2003-02-28

