Abstract
This paper presents an algorithm for solving dynamic programming problems that require determining optimal policies for the control of a special class of stochastic processes over an infinite planning horizon. These processes can be described mathematically as discrete-time-parameter Markov chains with a finite number of states which have been “embedded” in continuous time, in the sense that the time between transitions is a random variable whose probability distribution depends only on the states between which the transition takes place. Such processes are called Markov-renewal processes. The Markov processes considered by R. A. Howard in [1] are two special cases of this somewhat wider class of stochastic processes, and in these two cases the algorithm of this paper reduces to Howard's. In fact, with only slight modification, Howard's algorithm can be extended to this wider class of stochastic processes.
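The defining feature of a Markov-renewal process is that successive states form an ordinary finite-state Markov chain, while the holding time between transitions is drawn from a distribution indexed by the pair of states involved. The following sketch (not from the paper; the exponential holding-time choice and all names are illustrative assumptions) simulates such a process:

```python
import random

def simulate_markov_renewal(P, mean_time, n_steps, start=0, rng=None):
    """Simulate n_steps transitions of a Markov-renewal process.

    P[i][j]         -- transition probability from state i to state j
    mean_time[i][j] -- mean of the holding-time distribution for an
                       i -> j transition (exponential here, purely as
                       an illustrative assumption; the paper allows an
                       arbitrary distribution per state pair)
    Returns a list of (state, clock_time) pairs.
    """
    rng = rng or random.Random()
    state, clock = start, 0.0
    path = [(state, clock)]
    for _ in range(n_steps):
        # Next state is chosen from row P[state] of the embedded chain.
        nxt = rng.choices(range(len(P)), weights=P[state])[0]
        # Holding time depends on BOTH endpoints of the transition.
        clock += rng.expovariate(1.0 / mean_time[state][nxt])
        state = nxt
        path.append((state, clock))
    return path

# A two-state example with state-pair-dependent mean holding times.
P = [[0.2, 0.8],
     [0.6, 0.4]]
mean_time = [[1.0, 2.5],
             [0.5, 3.0]]
path = simulate_markov_renewal(P, mean_time, 10, rng=random.Random(1))
```

Setting every `mean_time[i][j]` to the same constant recovers the discrete-time case treated by Howard, which is one way to see why his algorithm appears as a special case.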