Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network
Open Access
- 29 June 2005
- journal article
- Published by Society for Neuroscience in Journal of Neuroscience
- Vol. 25 (26) , 6235-6242
- https://doi.org/10.1523/jneurosci.1478-05.2005
Abstract
Behavioral conditioning of cue-reward pairing results in a shift of midbrain dopamine (DA) cell activity from responding to the reward to responding to the predictive cue. However, the precise time course and mechanism underlying this shift remain unclear. Here, we report a combined single-unit recording and temporal difference (TD) modeling approach to this question. The data from recordings in conscious rats showed that DA cells retain responses to predicted reward after responses to conditioned cues have developed, at least early in training. This contrasts with previous TD models that predict a gradual stepwise shift in latency with responses to rewards lost before responses develop to the conditioned cue. By exploring the TD parameter space, we demonstrate that the persistent reward responses of DA cells during conditioning are only accurately replicated by a TD model with long-lasting eligibility traces (nonzero values for the parameter λ) and low learning rate (α). These physiological constraints for TD parameters suggest that eligibility traces and low per-trial rates of plastic modification may be essential features of neural circuits for reward learning in the brain. Such properties enable rapid but stable initiation of learning when the number of stimulus-reward pairings is limited, conferring significant adaptive advantages in real-world environments.
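The contrast the abstract draws between TD models with and without eligibility traces can be illustrated with a small simulation. The following is a minimal sketch, not the authors' model: it assumes a tapped-delay-line stimulus representation (one weight per time step after cue onset), and the function name `simulate_td_lambda`, the trial length, the cue and reward times, and the values of α, λ, and γ are all illustrative choices that do not come from the paper. The TD error on the transition into the cue step and into the reward step stands in for the DA response to each event.

```python
def simulate_td_lambda(n_trials, alpha, lam, gamma=0.98,
                       n_steps=20, cue=5, reward=15):
    """Toy TD(lambda) model of cue-reward conditioning (illustrative only).

    Returns two lists with one entry per trial: the TD error when the cue
    appears and the TD error when the reward is delivered.
    """
    n_feat = n_steps - cue          # one weight per time step from cue onset
    w = [0.0] * n_feat              # learned value of each post-cue step
    cue_resp, rew_resp = [], []

    def value(t):
        # Pre-cue and post-trial steps carry no stimulus, so their value is 0.
        return w[t - cue] if cue <= t < n_steps else 0.0

    for _ in range(n_trials):
        e = [0.0] * n_feat          # eligibility traces, reset every trial
        for t in range(n_steps):
            r = 1.0 if t + 1 == reward else 0.0
            delta = r + gamma * value(t + 1) - value(t)   # TD error
            # Decay every trace, then mark the current step as eligible.
            e = [gamma * lam * ei for ei in e]
            if t >= cue:
                e[t - cue] += 1.0
            # With lam > 0, one reward updates all recently visited steps.
            for i in range(n_feat):
                w[i] += alpha * delta * e[i]
            if t + 1 == cue:
                cue_resp.append(delta)
            if t + 1 == reward:
                rew_resp.append(delta)
    return cue_resp, rew_resp


# Compare no traces (lam = 0) with long-lasting traces (lam = 0.9),
# both at a low learning rate.
cue_no_trace, rew_no_trace = simulate_td_lambda(5, alpha=0.1, lam=0.0)
cue_trace, rew_trace = simulate_td_lambda(5, alpha=0.1, lam=0.9)
print("trial 2 cue response, lam=0.0:", cue_no_trace[1])
print("trial 2 cue response, lam=0.9:", cue_trace[1])
print("trial 2 reward response, lam=0.9:", rew_trace[1])
```

In this sketch, with λ = 0 the value estimate moves back from the reward by only one time step per trial, so the cue evokes no TD error on the second trial. With λ = 0.9 the first reward already credits the cue step, so a cue response appears on the second trial while the reward response is still large, which is qualitatively the pattern the abstract describes for early training.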