The absolutely expedient nonlinear reinforcement schemes under the unknown multiteacher environment
- 1 January 1983
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics
- Vol. SMC-13 (1), 100-108
- https://doi.org/10.1109/TSMC.1983.6313039
Abstract
Learning behaviours of variable-structure stochastic automata under a multiteacher environment are considered. The concepts of absolute expediency and ε-optimality in a single-teacher environment are extended by the introduction of an average weighted reward and are redefined for a multiteacher environment. As an extended form of the absolutely expedient learning algorithm, a general class of nonlinear learning algorithm, called the GAE scheme, is proposed as a reinforcement scheme in a multiteacher environment. It is shown that the GAE scheme is absolutely expedient and ε-optimal in the general n-teacher environment. Learning behaviours of the GAE scheme in various multiteacher environments are simulated by computer and the results indicate the effectiveness of the GAE scheme.