Abstract
The authors introduce a novel method for designing artificial intelligence (AI)-based controllers using approximate reasoning and reinforcement learning. The approach combines linguistic control rules obtained from human expert controllers with a form of reinforcement learning related to the temporal difference method. A major characteristic of the proposed system is its ability to use past experience with an incompletely known system to predict its future behavior. The method is applied to a cart-pole balancing problem, where it learns to balance the pole within 15 trials (within 10 trials in most cases) and outperforms previously developed schemes for this problem, such as the method of A.G. Barto et al. (1983) and the BOXES system of D. Michie and R.A. Chambers (1968).
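
As a rough illustration of how a temporal difference method can turn past experience into a prediction of future behavior, the sketch below applies plain tabular TD(0) to a toy chain of states that always ends in failure. It is not the authors' architecture: the abstract only says the learning rule is related to the temporal difference method, and the sketch contains no fuzzy rules and no cart-pole dynamics. The names `td0_update` and `run_trials`, the four-state chain, the failure reward of -1, and the values of `alpha` and `gamma` are all assumptions made for the example.

```python
def td0_update(values, state, reward, next_state, terminal, alpha=0.5, gamma=0.9):
    """Apply one TD(0) update to values[state] in place and return the TD error."""
    # A terminal successor contributes no future value to the target.
    bootstrap = 0.0 if terminal else gamma * values[next_state]
    td_error = reward + bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


def run_trials(num_trials=200, num_states=4):
    """Learn failure predictions on a hypothetical chain 0 -> 1 -> ... -> failure."""
    values = [0.0] * num_states
    failure_state = num_states - 1
    for _ in range(num_trials):
        state = 0
        while state != failure_state:
            next_state = state + 1
            terminal = next_state == failure_state
            # Assumed reward scheme: -1 on entering the failure state, 0 otherwise.
            reward = -1.0 if terminal else 0.0
            td0_update(values, state, reward, next_state, terminal)
            state = next_state
    return values


values = run_trials()
```

After repeated trials, states closer to the failure state receive more negative predictions, so the learned values act as an early warning of failure derived only from past trials. A controller could use such a prediction as an internal evaluation signal between rare external failure signals.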

This publication has 7 references indexed in Scilit: