Average reward reinforcement learning: Foundations, algorithms, and empirical results

1 January 1996

journal article
Published by Springer Nature in Machine Learning

Vol. 22 (1-3), 159-195
https://doi.org/10.1007/bf00114727

This publication has 28 references indexed in Scilit:

Residual Algorithms: Reinforcement Learning with Function Approximation
Published by Elsevier, 1995
Learning to act using real-time dynamic programming
Artificial Intelligence, 1995
An improved algorithm for solving communicating average reward Markov decision processes
Annals of Operations Research, 1991
A distributed asynchronous algorithm for expected average cost dynamic programming
Published by Institute of Electrical and Electronics Engineers (IEEE), 1990
Robotics in Service
Published by Springer Nature, 1989
Successive Approximation Methods for Solving Nested Functional Equations in Markov Decision Problems
Mathematics of Operations Research, 1984
Distributed dynamic programming
IEEE Transactions on Automatic Control, 1982
A Modified Form of the Iterative Method of Dynamic Programming
The Annals of Statistics, 1975
Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem
Operations Research, 1970
Discrete Dynamic Programming
The Annals of Mathematical Statistics, 1962