Mathematical principles of reinforcement

1 March 1994

journal article
research article
Published by Cambridge University Press (CUP) in Behavioral and Brain Sciences

Vol. 17 (1), 105-135
https://doi.org/10.1017/s0140525x00033628

Abstract

Effective conditioning requires a correlation between the experimenter's definition of a response and an organism's, but an animal's perception of its behavior differs from ours. These experiments explore various definitions of the response, using the slopes of learning curves to infer which comes closest to the organism's definition. The resulting exponentially weighted moving average provides a model of memory that is used to ground a quantitative theory of reinforcement. The theory assumes that incentives excite behavior and focus the excitement on responses that are contemporaneous in memory. The correlation between the organism's memory and the behavior measured by the experimenter is given by coupling coefficients, which are derived for various schedules of reinforcement. The coupling coefficients for simple schedules may be concatenated to predict the effects of complex schedules. The coefficients are inserted into a generic model of arousal and temporal constraint to predict response rates under any scheduling arrangement. The theory posits a response-indexed decay of memory, not a time-indexed one. It requires that incentives displace memory for the responses that occur before them, and may truncate the representation of the response that brings them about. As a contiguity-weighted correlation model, it bridges opposing views of the reinforcement process. By placing the short-term memory of behavior in so central a role, it provides a behavioral account of a key cognitive process.
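
The exponentially weighted moving average mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so everything here is an assumption made for illustration: responses are coded 1 (target) or 0 (other), the weight `beta` is a hypothetical parameter, and the function names `ewma_memory` and `weight_on_last_n` are invented. The update runs once per response rather than once per unit of time, which is one way to read "response-indexed decay".

```python
def ewma_memory(responses, beta=0.25):
    """Response-indexed EWMA (illustrative assumption, not the paper's equations).

    Memory is updated once per response, so older responses fade with each
    new response emitted, not with elapsed time.
    """
    memory = 0.0
    trace = []
    for y in responses:
        # New memory = beta * current response + (1 - beta) * old memory.
        memory = beta * y + (1.0 - beta) * memory
        trace.append(memory)
    return trace


def weight_on_last_n(n, beta=0.25):
    """Total EWMA weight carried by the most recent n responses.

    Sum of the geometric weights beta * (1 - beta)**k for k = 0..n-1.
    """
    return 1.0 - (1.0 - beta) ** n


# Four target responses in a row, starting from an empty memory.
print(ewma_memory([1, 1, 1, 1], beta=0.25))
print(weight_on_last_n(4, beta=0.25))
```

Starting from an empty memory, the final memory value after `n` consecutive target responses equals the total weight on the last `n` responses. This gives a rough intuition for how the content of memory could correlate with the response the experimenter measures. The coupling coefficients the paper actually derives for specific schedules are not stated in the abstract and are not reproduced here.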
