

Jürgen Schmidhuber, TU Munich, Univ. of Colorado at Boulder

In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1458-1463. IEEE, 1991.


A controller is a device that receives inputs from a (dynamic) environment and produces outputs that manipulate the environmental state. A model-building control system is a controller with an additional module (the 'world model') that is trained to predict future inputs from previous input/action pairs. The novel curious model-building control system described in this paper is a model-building control system that actively tries to provoke situations for which it has learned to expect to learn something about the environment. Such a system has been implemented as a 4-network system based on Watkins' Q-learning algorithm, which can be used to maximize the expectation of the temporal derivative of the adaptive assumed reliability of future predictions. An experiment with an artificial non-deterministic environment demonstrates that the system can be superior to previous model-building control systems. The latter do not address the problem of modelling the reliability of the world model's predictions in uncertain environments, and they use ad-hoc methods (like random search) to train the world model.
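The idea of rewarding the controller for changes in the assumed reliability of predictions can be illustrated with a minimal tabular sketch. This is not the 4-network implementation from the paper: it assumes a single-state toy environment with two actions, lookup tables in place of networks, and a reliability estimate defined as the expected absolute prediction error. The function name `run_curious_agent`, all parameter values, and the environment itself are hypothetical choices made for illustration.

```python
import random


def run_curious_agent(steps=3000, seed=0):
    """Toy single-state sketch: Q-learning rewarded by reliability gains."""
    rng = random.Random(seed)
    n_actions = 2
    # Assumed learning rates, discount, and exploration rate (illustrative values).
    lr, alpha, gamma, epsilon = 0.1, 0.1, 0.9, 0.2

    pred = [0.5] * n_actions          # world model: predicted next input per action
    expected_err = [0.5] * n_actions  # reliability model: expected world-model error
    q = [0.0] * n_actions             # controller: Q-values over curiosity reward

    def environment(action):
        # Hypothetical environment: action 0 is predictable, action 1 is a coin flip.
        return 1.0 if action == 0 else float(rng.random() < 0.5)

    for _ in range(steps):
        # Epsilon-greedy action selection on the curiosity Q-values.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=q.__getitem__)

        observation = environment(action)
        error = abs(pred[action] - observation)

        # Update the reliability model; the reward is the drop in expected error,
        # i.e. a discrete-time stand-in for the temporal derivative of reliability.
        old_expected = expected_err[action]
        expected_err[action] += lr * (error - expected_err[action])
        curiosity_reward = old_expected - expected_err[action]

        # Move the world model's prediction toward the observed input.
        pred[action] += lr * (observation - pred[action])

        # One-step Q-learning update (single state, so the next state is the same).
        q[action] += alpha * (curiosity_reward + gamma * max(q) - q[action])

    return pred, expected_err, q
```

Calling `run_curious_agent()` returns the prediction table, the expected-error table, and the Q-values. In this toy setup the expected error of the predictable action shrinks while the world model is still learning, which yields positive reward, whereas the coin-flip action keeps a high expected error that does not improve over time, so its reward should average out near zero. That is the intended contrast with a controller that is simply attracted to large prediction errors.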

Juergen Schmidhuber 2003-02-28
