Adaptive Policies for Markov Renewal Programs

Abstract
We recast a class of denumerable-state, infinite-action Markov renewal programs with unknown parameters as one-state programs whose actions correspond to stationary policies in the original program. Under suitable conditions, we find an adaptive (nonstationary) policy that is optimal in the sense of maximizing the long-run expected reward per unit time.
