The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms
- 1 March 1978
- journal article
- research article
- Published by Cambridge University Press (CUP) in Journal of Applied Probability
- Vol. 15 (02), 356-373
- https://doi.org/10.1017/s0021900200045630
Abstract
This paper is concerned with the optimality equation for the average costs in a denumerable state semi-Markov decision model. It will be shown that under each of a number of recurrency conditions on the transition probability matrices associated with the stationary policies, the optimality equation has a bounded solution. This solution indeed yields a stationary policy which is optimal for a strong version of the average cost optimality criterion. Besides the existence of a bounded solution to the optimality equation, we will show that both the value-iteration method and the policy-iteration method can be used to determine such a solution. For the latter method we will prove that the average costs and the relative cost functions of the policies generated converge to a solution of the optimality equation.
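As an illustration of the value-iteration idea mentioned in the abstract, the following is a minimal sketch and not the paper's algorithm. It assumes a finite state space (the paper treats the denumerable case), and all names (`smdp_value_iteration`, `P`, `c`, `tau`) are hypothetical. It assumes the optimality equation takes the commonly used form v(i) = min_a { c(i,a) - g*tau(i,a) + sum_j p_ij(a) v(j) }, where g is the average cost per unit time and tau(i,a) is the expected sojourn time. It applies a standard data transformation to an equivalent discrete-time model and then runs relative value iteration, which is assumed to converge here because every transition probability in the example is positive.

```python
import numpy as np


def smdp_value_iteration(P, c, tau, tol=1e-10, max_iter=100_000):
    """Relative value iteration for a finite average-cost semi-Markov model.

    P   : array (A, S, S), P[a, i, j] = transition probability i -> j under a
    c   : array (S, A), expected cost incurred per visit to i under action a
    tau : array (S, A), expected sojourn time in i under action a (all > 0)
    Returns (g, v, policy): average cost per unit time, relative costs with
    v[0] = 0, and a minimizing action for each state.
    """
    P = np.asarray(P, dtype=float)
    c = np.asarray(c, dtype=float)
    tau = np.asarray(tau, dtype=float)
    n_states = P.shape[1]

    # Data transformation to a discrete-time model with the same average cost.
    # t0 is chosen strictly below every sojourn time, which adds a positive
    # self-transition in each state and so removes periodicity.
    t0 = 0.5 * tau.min()
    ratio = (t0 / tau).T                                   # shape (A, S)
    P_t = ratio[:, :, None] * P + (1.0 - ratio)[:, :, None] * np.eye(n_states)
    c_t = c / tau                                          # cost rate, (S, A)

    h = np.zeros(n_states)
    g = 0.0
    q = c_t.copy()
    for _ in range(max_iter):
        # q[i, a] = c_t[i, a] + sum_j P_t[a, i, j] * h[j]
        q = c_t + np.einsum("aij,j->ia", P_t, h)
        Th = q.min(axis=1)
        g = Th[0]                  # state 0 is the reference state
        h_new = Th - g
        diff = np.max(np.abs(h_new - h))
        h = h_new
        if diff < tol:
            break

    policy = q.argmin(axis=1)
    # Rescale so that v solves the equation of the original semi-Markov model.
    return g, t0 * h, policy


# Hypothetical two-state, two-action example (made-up numbers).
P_example = np.array([[[0.5, 0.5], [0.4, 0.6]],
                      [[0.9, 0.1], [0.2, 0.8]]])
c_example = np.array([[1.0, 2.0], [3.0, 0.5]])
tau_example = np.array([[1.0, 2.0], [1.5, 1.0]])

g_example, v_example, policy_example = smdp_value_iteration(
    P_example, c_example, tau_example
)
```

The returned pair `(g_example, v_example)` can be checked by substituting it back into the assumed optimality equation and confirming that the residual is near zero in every state.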