HQ-Learning
Adaptive Behavior 6(2):219-246, 1997

Marco Wiering
Jürgen Schmidhuber
IDSIA, http://www.idsia.ch

All images at the end of the bibliography!

Abstract:

HQ-learning is a hierarchical extension of Q($\lambda$)-learning designed to solve certain types of partially observable Markov decision problems (POMDPs). HQ automatically decomposes POMDPs into sequences of simpler subtasks that can be solved by memoryless policies learnable by reactive subagents. HQ can solve partially observable mazes with more states than those used in most previous POMDP work.
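
The decomposition idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the policies and subgoals are hand-coded rather than learned, and the three-cell task, the names `OBS`, `run_sequence`, `go_left`, `go_right`, and the rule that control passes to the next subagent once the active one observes its subgoal are all assumptions made for illustration. The task needs the key (cell "K") before the goal (cell "G") counts, so the start observation "A" calls for different actions at different times, which no single memoryless policy can provide.

```python
from itertools import product

# Hypothetical toy task: three cells in a row with observations "K", "A", "G".
# The agent starts at "A" and must visit "K" (key) before "G" (goal) counts.
OBS = ["K", "A", "G"]
LEFT, RIGHT = -1, +1

def run_sequence(agents, start=1, max_steps=20):
    """Run memoryless subagents in order; return (observation trace, solved)."""
    pos, has_key, active = start, False, 0
    trace = [OBS[pos]]
    for _ in range(max_steps):
        policy, subgoal = agents[active]
        # Memoryless: the action depends only on the current observation.
        pos = min(max(pos + policy[OBS[pos]], 0), len(OBS) - 1)
        trace.append(OBS[pos])
        has_key = has_key or OBS[pos] == "K"
        if OBS[pos] == "G" and has_key:
            return trace, True
        # Assumed transfer rule: hand over control once the subgoal is observed.
        if OBS[pos] == subgoal and active < len(agents) - 1:
            active += 1
    return trace, False

# Two hand-coded subagents: (policy, subgoal observation).
go_left = ({o: LEFT for o in OBS}, "K")
go_right = ({o: RIGHT for o in OBS}, None)
trace, solved = run_sequence([go_left, go_right])

# Try every single memoryless policy (8 in total) on the same task.
single_solves = any(
    run_sequence([(dict(zip(OBS, acts)), None)])[1]
    for acts in product((LEFT, RIGHT), repeat=len(OBS))
)

print(trace, solved, single_solves)
```

In this sketch the two-subagent sequence reaches the goal while every single memoryless policy fails, which is the kind of task the abstract describes HQ as decomposing automatically.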

Keywords: reinforcement learning, hierarchical Q-learning, POMDPs, non-Markov, subgoal learning.

Running head: HQ-Learning

INTRODUCTION
HQ-LEARNING
- LEARNING RULES
EXPERIMENTS
- LEARNING TO SOLVE A PARTIALLY OBSERVABLE MAZE
- THE KEY AND THE DOOR
PREVIOUS WORK
HQ'S ADVANTAGES AND LIMITATIONS
CONCLUSION
ACKNOWLEDGMENTS
Bibliography

Juergen Schmidhuber 2003-02-24

