Abstract
The learning behavior of a stochastic automaton operating in a multiteacher environment is considered. The GLR-I scheme, a generalized form of the LR-I scheme, is proposed as a reinforcement scheme for this environment. The GLR-I scheme is shown to be absolutely expedient and ϵ-optimal in the general n-teacher environment. Computer simulations of its learning behavior indicate the effectiveness of the scheme.
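
For orientation, the classical single-teacher LR-I (linear reward-inaction) scheme that the abstract takes as its starting point can be sketched as follows. This is only an illustration of the standard textbook rule, not the GLR-I scheme itself, whose definition is not given in the abstract. The function names `lri_update` and `simulate`, the step size `a`, and the reward probabilities are all assumptions made for the example.

```python
import random


def lri_update(p, chosen, rewarded, a=0.05):
    """One step of the classical single-teacher L_{R-I} rule (illustrative).

    p        : list of action probabilities, summing to 1
    chosen   : index of the action the automaton just performed
    rewarded : True if the teacher rewarded that action
    a        : step size in (0, 1), a hypothetical default
    """
    if not rewarded:
        # "Inaction": a penalty leaves the probabilities unchanged.
        return list(p)
    # Reward: shrink every probability by (1 - a) ...
    q = [(1.0 - a) * pj for pj in p]
    # ... and move the chosen action toward 1 so the total stays 1.
    q[chosen] = p[chosen] + a * (1.0 - p[chosen])
    return q


def simulate(reward_probs, steps=5000, a=0.05, seed=0):
    """Run the automaton against one stationary random teacher (assumed setup)."""
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r
    for _ in range(steps):
        chosen = rng.choices(range(r), weights=p)[0]
        rewarded = rng.random() < reward_probs[chosen]
        p = lri_update(p, chosen, rewarded, a)
    return p


if __name__ == "__main__":
    # Hypothetical teacher: action 1 is rewarded most often.
    print(simulate([0.2, 0.8, 0.4]))
```

With a small step size, the probability vector in such a run tends to concentrate on the action with the highest reward probability, which is the behavior that ϵ-optimality describes. A multiteacher version would have to combine the responses of n teachers into the update, and how GLR-I does this is specified in the paper rather than here.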
