Average reward reinforcement learning: Foundations, algorithms, and empirical results
- 1 January 1996
- journal article
- Published by Springer Nature in Machine Learning
- Vol. 22 (1-3), 159-195
- https://doi.org/10.1007/bf00114727
Abstract
No abstract available
This publication has 28 references indexed in Scilit:
- Residual Algorithms: Reinforcement Learning with Function Approximation. Published by Elsevier, 1995
- Learning to act using real-time dynamic programming. Artificial Intelligence, 1995
- An improved algorithm for solving communicating average reward Markov decision processes. Annals of Operations Research, 1991
- A distributed asynchronous algorithm for expected average cost dynamic programming. Published by Institute of Electrical and Electronics Engineers (IEEE), 1990
- Robotics in Service. Published by Springer Nature, 1989
- Successive Approximation Methods for Solving Nested Functional Equations in Markov Decision Problems. Mathematics of Operations Research, 1984
- Distributed dynamic programming. IEEE Transactions on Automatic Control, 1982
- A Modified Form of the Iterative Method of Dynamic Programming. The Annals of Statistics, 1975
- Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem. Operations Research, 1970
- Discrete Dynamic Programming. The Annals of Mathematical Statistics, 1962