Abstract
In a simply connected Markov renewal problem, each state is either transient under all policies or an element of a single chain under some policy. This property is easily verified; it implies invariance of the maximal long-term average return (gain) with respect to the initial state, which in turn assures convergence of Odoni's bounds in the damped value-iteration algorithm due to Schweitzer, even when the maximal-gain process is multiple-chained and/or periodic.
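As a rough illustration of the kind of iteration the abstract refers to, the sketch below applies a damped data transformation to a small Markov renewal problem and tracks Odoni-style lower and upper bounds on the gain. It is a minimal sketch under stated assumptions, not the paper's own formulation. The function name `damped_value_iteration`, the array layout, the damping parameter `tau`, and the two-state example are all hypothetical. The transformation used here (rewards `q / T`, transition matrix `I + (tau / T) * (P - I)`) is one common form of the damping.

```python
import numpy as np


def damped_value_iteration(P, q, T, tau, tol=1e-9, max_iter=10_000):
    """Hypothetical sketch: damped value iteration with gain bounds.

    P: (A, S, S) transition probabilities per action (assumed layout)
    q: (A, S) expected one-transition rewards
    T: (A, S) expected holding times, all positive
    tau: damping parameter, small enough that the damped matrices
         stay nonnegative (checked below)
    Returns (lower, upper) bounds on the maximal gain per unit time.
    """
    P = np.asarray(P, dtype=float)
    q = np.asarray(q, dtype=float)
    T = np.asarray(T, dtype=float)
    n_states = P.shape[1]
    eye = np.eye(n_states)

    # Damped data transformation (an assumed, commonly used form).
    scale = (tau / T)[:, :, None]
    P_damped = eye + scale * (P - eye)
    r_damped = q / T
    assert (P_damped >= -1e-12).all(), "tau is too large for these data"

    v = np.zeros(n_states)
    lower, upper = -np.inf, np.inf
    for _ in range(max_iter):
        # One value-iteration step on the transformed problem.
        v_next = (r_damped + P_damped @ v).max(axis=0)
        diff = v_next - v
        # Bounds on the gain: min and max of the one-step differences.
        lower, upper = diff.min(), diff.max()
        if upper - lower < tol:
            break
        # Renormalize so the iterates stay bounded; since the damped
        # matrices are stochastic, this leaves the differences unchanged.
        v = v_next - v_next[-1]
    return lower, upper


# Illustrative data: one action, two states that alternate (a periodic chain).
P = [[[0.0, 1.0], [1.0, 0.0]]]
q = [[1.0, 3.0]]
T = [[2.0, 1.0]]
lower, upper = damped_value_iteration(P, q, T, tau=0.5)
# By hand, the gain is (1 + 3) / (2 + 1) = 4/3 per unit time.
print(lower, upper)
```

The example chain is periodic, so undamped iteration on it would oscillate; with `tau` below the smallest holding time, the damped chain is aperiodic and the two bounds close in on a common value.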
