Optimal decision procedures for finite Markov chains. Part I: Examples
- 1 August 1973
- Journal article (research article)
- Published by Cambridge University Press (CUP) in Advances in Applied Probability
- Vol. 5 (2), 328–339
- https://doi.org/10.1017/s0001867800039203
Abstract
A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a prescribed set depending on the state occupied at any time. Given the immediate cost for each choice, it is required to minimise the expected cost over an infinite future, without discounting. Various techniques are reviewed for the case when there is a finite set of possible transition matrices, and an example is given to illustrate the unpredictable behaviour of policy sequences derived by backward induction. Further examples show that the existing methods may break down when there is an infinite family of transition matrices. A new approach is suggested, based on the idea of classifying the states according to their accessibility from one another.
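As a concrete illustration of the recursion behind the policy sequences mentioned in the abstract, the following is a minimal backward-induction sketch for a small finite model. The transition matrices and cost vectors below are hypothetical, chosen only to show the mechanics; they do not reproduce the paper's examples.

```python
import numpy as np

# Hypothetical model: two states, two actions.
# P[a] is the transition matrix under action a;
# c[a] is the immediate-cost vector under action a.
P = np.array([
    [[0.5, 0.5],   # action 0
     [0.2, 0.8]],
    [[0.9, 0.1],   # action 1
     [0.7, 0.3]],
])
c = np.array([
    [1.0, 2.0],    # action 0
    [1.5, 0.5],    # action 1
])

def backward_induction(P, c, horizon):
    """Return the minimising policy for each horizon n = 1, ..., horizon.

    Recursion: v_n(i) = min_a [ c_a(i) + sum_j P_a(i, j) v_{n-1}(j) ],
    with terminal values v_0 = 0.
    """
    n_actions, n_states = c.shape
    v = np.zeros(n_states)          # v_0 = 0
    policies = []
    for _ in range(horizon):
        # q[a, i]: cost of taking action a in state i, then acting
        # optimally over the remaining stages
        q = c + P @ v               # shape (n_actions, n_states)
        policies.append(q.argmin(axis=0))
        v = q.min(axis=0)
    return policies

# The point the paper's first example makes: this sequence of policies
# need not settle down as the horizon grows, even though v_n / n
# converges to the minimal average cost per stage.
for n, pi in enumerate(backward_induction(P, c, horizon=8), start=1):
    print(f"horizon {n}: policy {pi}")
```

In this sketch the printed policy may oscillate between horizons rather than converge, which is the kind of unpredictable behaviour of undiscounted backward induction that the paper examines.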