Marco Wiering
Jürgen Schmidhuber
IDSIA, http://www.idsia.ch
HQ-learning is a hierarchical extension of Q(λ)-learning
designed to solve certain types of partially observable Markov decision
problems (POMDPs).
HQ automatically decomposes POMDPs into sequences of simpler subtasks
that can be solved by memoryless policies learnable by reactive subagents.
HQ can solve partially observable mazes with more states than those
considered in most previous POMDP work.
Keywords: reinforcement learning, hierarchical Q-learning, POMDPs, non-Markov, subgoal learning.
Running head: HQ-Learning