Abstract
This note points out that upper and lower bounds on the optimal value function of a finite discounted Markov decision problem can be computed easily when the problem is solved by linear programming or policy iteration. These bounds can be used to identify suboptimal actions.
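The abstract does not state the bounds themselves, so the following is only a minimal sketch under an assumption: it uses the classical MacQueen-Porteus style bounds, which may differ from the ones derived in the note. Given any value estimate v (here the value of the current policy, as produced by a policy-iteration step), let Tv be its one-step Bellman update and d = Tv - v. Then Tv + β/(1-β)·min d ≤ v* ≤ Tv + β/(1-β)·max d, and an action is provably suboptimal in a state when even its optimistic one-step value falls strictly below the lower bound there. All function names, the array layout, and the two-state example data are hypothetical.

```python
import numpy as np

def bellman_q(P, r, v, beta):
    """Q[s, a] = r[s, a] + beta * sum_t P[a, s, t] * v[t]."""
    return r + beta * np.einsum("ast,t->sa", P, v)

def policy_value(P, r, policy, beta):
    """Exact value of a stationary policy (policy-evaluation step)."""
    n_states = r.shape[0]
    states = np.arange(n_states)
    P_pi = P[policy, states, :]          # transition matrix under the policy
    r_pi = r[states, policy]             # reward vector under the policy
    return np.linalg.solve(np.eye(n_states) - beta * P_pi, r_pi)

def value_bounds(P, r, v, beta):
    """Assumed MacQueen-Porteus style bounds on v* from an estimate v."""
    tv = bellman_q(P, r, v, beta).max(axis=1)
    d = tv - v
    scale = beta / (1.0 - beta)
    return tv + scale * d.min(), tv + scale * d.max()

def suboptimal_actions(P, r, lower, upper, beta):
    """mask[s, a] is True when action a is provably suboptimal in state s."""
    q_optimistic = bellman_q(P, r, upper, beta)
    return q_optimistic < lower[:, None]

# Hypothetical two-state, two-action problem; P[a, s, t], r[s, a].
beta = 0.9
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.1, 0.9], [0.6, 0.4]]])
r = np.array([[1.0, 0.0],
              [2.0, 0.5]])

policy = np.array([0, 0])                # arbitrary starting policy
v = policy_value(P, r, policy, beta)
lower, upper = value_bounds(P, r, v, beta)
mask = suboptimal_actions(P, r, lower, upper, beta)
```

Because v here is the value of an actual policy, d is nonnegative and the lower bound is at least v itself. Actions flagged in `mask` could be dropped from later improvement steps, which is the kind of use the abstract describes.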
