Reinforcement learning is direct adaptive optimal control
- 1 April 1992
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Control Systems
- Vol. 12 (2), 19-22
- https://doi.org/10.1109/37.126844
Abstract
Neural network reinforcement learning methods are described and considered as a direct approach to adaptive optimal control of nonlinear systems. These methods have their roots in studies of animal learning and in early learning control work. An emerging deeper understanding of these methods is summarized that is obtained by viewing them as a synthesis of dynamic programming and stochastic approximation methods. The focus is on Q-learning systems, which maintain estimates of utilities for all state-action pairs and make use of these estimates to select actions. The use of hybrid direct/indirect methods is briefly discussed.
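To make the abstract's description of Q-learning concrete, here is a minimal tabular sketch of the standard Q-learning update. It is not taken from the article: the five-state chain environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions chosen for illustration, and a lookup table stands in for the neural network the article discusses.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1; action 0 moves left (clamped at state 0),
    action 1 moves right. Entering the last state pays reward 1 and ends
    the episode. All of this is an illustrative assumption.
    """
    rng = random.Random(seed)
    # One utility estimate per state-action pair, as the abstract describes.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Use the estimates to select an action (epsilon-greedy,
            # with ties broken at random).
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Dynamic-programming style target, applied incrementally
            # in the manner of stochastic approximation.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q[:-1]]
```

Calling `greedy_policy(q_learning())` returns the action the learned estimates prefer in each non-terminal state. The method is "direct" in the sense of the title: no model of the chain's transitions is ever estimated, only the state-action utilities.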
This publication has 20 references indexed in Scilit:
- Computationally efficient adaptive control algorithms for Markov chains. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Neural networks for control and system identification. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Learning to perceive and act by trial and error. Machine Learning, 1991
- Learning sequential decision rules using simulation models and competition. Machine Learning, 1990
- Real-time heuristic search. Artificial Intelligence, 1990
- Learning to predict by the methods of temporal differences. Machine Learning, 1988
- Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research. IEEE Transactions on Systems, Man, and Cybernetics, 1987
- Decentralized learning in finite Markov chains. IEEE Transactions on Automatic Control, 1986
- Adaptive control of Markov chains, I: Finite parameter set. IEEE Transactions on Automatic Control, 1979
- An adaptive optimal controller for discrete-time Markov environments. Information and Control, 1977