Semi-Markov Decision Processes with Unbounded Rewards

1 March 1973

journal article
Published by Institute for Operations Research and the Management Sciences (INFORMS) in Management Science

Vol. 19 (7), 717-731
https://doi.org/10.1287/mnsc.19.7.717

Abstract

We consider a semi-Markov decision process with arbitrary action space; the state space is the nonnegative integers. As in queueing systems, we assume that {0, 1, 2, …, n + N} is the set of states accessible from state n in one transition, where N is finite and independent of n. The novel feature of this model is that the one-period reward is not required to be uniformly bounded; instead, we merely assume it to be bounded by a polynomial in n. Our main concern is with the average cost problem. A set of conditions sufficient for there to be an optimal stationary policy which can be obtained from the usual functional equation is developed. These conditions are quite weak and, as illustrated in several queueing examples, are easily verified.
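
For orientation, the "usual functional equation" mentioned in the abstract can be sketched in its standard textbook form for average-reward semi-Markov decision processes. The notation here is an assumption and is not taken from the article: A(n) is the set of actions available in state n, r(n, a) the expected one-period reward, \tau(n, a) the expected time until the next transition, p_{nj}(a) the transition probability, g the optimal average reward per unit time, v the relative value function, and C, m the constants of the polynomial bound.

```latex
% Assumed notation (not from the article): A(n), r, \tau, p_{nj}, g, v, C, m.
% Standard average-reward optimality equation for a semi-Markov decision
% process, with the sum truncated at n + N because only states
% 0, 1, ..., n + N are accessible from state n in one transition.
\[
  v(n) \;=\; \sup_{a \in A(n)}
  \Bigl\{\, r(n,a) \;-\; g\,\tau(n,a) \;+\; \sum_{j=0}^{n+N} p_{nj}(a)\, v(j) \Bigr\},
  \qquad n = 0, 1, 2, \ldots
\]
% Polynomial growth condition replacing uniform boundedness of the reward:
\[
  \lvert r(n,a) \rvert \;\le\; C\,\bigl(1 + n^{m}\bigr)
  \qquad \text{for all } n \text{ and all } a \in A(n).
\]
```

In this form, a stationary policy that attains the supremum in every state is the natural candidate for optimality; the precise conditions under which that holds are the subject of the article and are not reproduced here.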
