The Optimal Reward Operator in Dynamic Programming

Open Access

1 October 1974

journal article
Published by Institute of Mathematical Statistics in The Annals of Probability

Vol. 2 (5) , 926-941
https://doi.org/10.1214/aop/1176996558

Abstract

Consider a dynamic programming problem with analytic state space $S$, analytic constraint set $A$, and semi-analytic reward function $r(x, P, y)$ for $(x, P)\in A$ and $y\in S$: namely, $\{r > a\}$ is an analytic set for all $a$. Let $Tf$ be the optimal reward in one move, with the modified reward function $r(x, P, y) + f(y)$. The optimal reward in $n$ moves is shown to be $T^n0$, a semi-analytic function on $S$. It is also shown that for any $n$ and positive $\varepsilon$, there is an $\varepsilon$-optimal strategy for the $n$-move game, measurable on the $\sigma$-field generated by the analytic sets.

Keywords

GAME
VAREPSILON
FUNCTION
SIGMA
DYNAMIC PROGRAMMING
OPTIMAL REWARD

This publication has 0 references indexed in Scilit: