Reinforcement learning is direct adaptive optimal control

1 April 1992

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Control Systems

Vol. 12 (2), 19-22
https://doi.org/10.1109/37.126844

Abstract

Neural network reinforcement learning methods are described and considered as a direct approach to adaptive optimal control of nonlinear systems. These methods have their roots in studies of animal learning and in early learning control work. An emerging deeper understanding of these methods is summarized that is obtained by viewing them as a synthesis of dynamic programming and stochastic approximation methods. The focus is on Q-learning systems, which maintain estimates of utilities for all state-action pairs and make use of these estimates to select actions. The use of hybrid direct/indirect methods is briefly discussed.
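
For readers unfamiliar with the Q-learning idea mentioned in the abstract, the following is a minimal tabular sketch in Python. It is not taken from the article. The five-state chain environment, the reward of 1 at the goal, the epsilon-greedy exploration rule, and all names and parameter values (`step`, `q_learning`, `alpha`, `gamma`, `epsilon`) are assumptions chosen for illustration. It shows only the two ingredients the abstract names: a table of utility estimates for every state-action pair, and action selection driven by those estimates.

```python
import random


def step(state, action, goal):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Returns (next_state, reward). Reward is 1.0 on reaching the goal, else 0.0.
    """
    next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on the chain above (illustrative parameter values)."""
    rng = random.Random(seed)
    goal = n_states - 1
    # One utility estimate per state-action pair, initialised to zero.
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy selection: mostly use the current estimates.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])

            next_state, reward = step(state, action, goal)

            # The goal is terminal, so no future value is bootstrapped from it.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            # Move the estimate a fraction alpha toward the one-step target.
            q[state][action] += alpha * (reward + future - q[state][action])

            state = next_state
            if state == goal:
                break
    return q


if __name__ == "__main__":
    table = q_learning()
    for s, row in enumerate(table[:-1]):
        greedy = max(range(len(row)), key=lambda a: row[a])
        print(f"state {s}: estimates {row}, greedy action {greedy}")
```

The update uses only observed transitions and never builds a model of the environment, which is the sense in which such a method is "direct". The sketch maximises reward for simplicity; a cost-minimising formulation would replace `max` with `min`.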
