Semi-Markov Decision Processes with Unbounded Rewards
- 1 March 1973
- journal article
- Published by Institute for Operations Research and the Management Sciences (INFORMS) in Management Science
- Vol. 19 (7), 717-731
- https://doi.org/10.1287/mnsc.19.7.717
Abstract
We consider a semi-Markov decision process with arbitrary action space; the state space is the nonnegative integers. As in queueing systems, we assume that {0, 1, 2, …, n + N} is the set of states accessible from state n in one transition, where N is finite and independent of n. The novel feature of this model is that the one-period reward is not required to be uniformly bounded; instead, we merely assume it to be bounded by a polynomial in n. Our main concern is with the average cost problem. A set of conditions sufficient for there to be an optimal stationary policy which can be obtained from the usual functional equation is developed. These conditions are quite weak and, as illustrated in several queueing examples, are easily verified.
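For orientation, the functional equation mentioned in the abstract is commonly written, for average-cost semi-Markov decision processes, along the following lines. This is only a sketch: the notation (g, h, c, τ, p, A(n), K, m) is assumed here for illustration and is not taken from the paper, whose exact statement and conditions may differ.

```latex
% Assumed notation (not from the paper):
%   A(n)        action set available in state n
%   c(n,a)      expected cost incurred until the next transition
%   \tau(n,a)   expected time until the next transition
%   p_{nj}(a)   transition probability, zero for j > n + N
%   g           optimal long-run average cost per unit time
%   h           relative value function
\begin{align*}
  h(n) &= \inf_{a \in A(n)} \Big\{ c(n,a) - g\,\tau(n,a)
          + \sum_{j=0}^{n+N} p_{nj}(a)\, h(j) \Big\},
          \qquad n = 0, 1, 2, \dots \\
  |c(n,a)| &\le K\,(1 + n^{m})
          \qquad \text{for all } n \text{ and } a \in A(n)
\end{align*}
```

The second line is one way to express the polynomial bound on the one-period reward, with K and m as hypothetical constants. The finite upper limit n + N in the sum reflects the assumption that only states 0 through n + N are reachable from n in one transition.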