HQ-Learning
- 1 September 1997
- research article
- Published by SAGE Publications in Adaptive Behavior
- Vol. 6 (2), 219-246
- https://doi.org/10.1177/105971239700600202
Abstract
HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of partially observable Markov decision problems (POMDPs). HQ automatically decomposes POMDPs into sequences of simpler subtasks that can be solved by memoryless policies learnable by reactive subagents. HQ can solve partially observable mazes with more states than those used in most previous POMDP work.
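The abstract's core idea (an ordered chain of reactive subagents, each holding a memoryless Q-policy plus a table of values for candidate subgoal observations, with control passing to the next subagent once the chosen subgoal is observed) can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the paper's algorithm: the `env.reset()`/`env.step()` interface is hypothetical, and plain one-step Q-updates stand in for the eligibility traces of Q(λ).

```python
import random
from collections import defaultdict

class HQAgent:
    """One reactive subagent: a memoryless Q-policy over observations,
    plus an HQ-table valuing observations as potential subgoals."""
    def __init__(self, n_obs, n_actions):
        self.Q = defaultdict(float)   # Q[(obs, action)] -> value of acting
        self.HQ = defaultdict(float)  # HQ[obs]          -> value of that subgoal
        self.n_obs, self.n_actions = n_obs, n_actions

    def pick_subgoal(self, eps=0.1):
        # Epsilon-greedy choice of the observation that ends this subtask.
        if random.random() < eps:
            return random.randrange(self.n_obs)
        return max(range(self.n_obs), key=lambda o: self.HQ[o])

    def pick_action(self, obs, eps=0.1):
        # Memoryless (reactive) policy: action depends on current obs only.
        if random.random() < eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.Q[(obs, a)])

def hq_episode(env, agents, alpha=0.1, gamma=0.9):
    """One episode: agents take control in fixed order; the active agent
    acts until its chosen subgoal observation appears, then hands off.
    (One-step Q-updates here; the paper uses Q(lambda) with traces.)"""
    obs, done, i = env.reset(), False, 0
    subgoal = agents[0].pick_subgoal()
    while not done:
        agent = agents[i]
        a = agent.pick_action(obs)
        next_obs, r, done = env.step(a)  # hypothetical env interface
        best_next = max(agent.Q[(next_obs, b)] for b in range(agent.n_actions))
        agent.Q[(obs, a)] += alpha * (r + gamma * best_next - agent.Q[(obs, a)])
        obs = next_obs
        if obs == subgoal and i + 1 < len(agents):
            # Subtask solved: credit this subgoal choice, pass control on.
            nxt = agents[i + 1]
            handoff = max(nxt.HQ[o] for o in range(nxt.n_obs))
            agent.HQ[subgoal] += alpha * (r + gamma * handoff - agent.HQ[subgoal])
            i += 1
            subgoal = agents[i].pick_subgoal()
```

The point of the decomposition is that each subagent faces a subtask simple enough for a memoryless policy, so the sequence of subgoals, rather than any internal memory, carries the history information the POMDP requires.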