Human reinforcement learning subdivides structured action spaces by learning effector-specific values.

PubWeight™: 1.66 | Rank: Top 3%

View Article (PMC 2796632)

Published in J Neurosci on October 28, 2009

Authors

Samuel J Gershman1, Bijan Pesaran, Nathaniel D Daw

Author Affiliations

1: Center for Neural Science, New York University, New York, New York 10003, USA. sjgershm@princeton.edu

Articles citing this

States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron (2010) 5.30

Model-based influences on humans' choices and striatal prediction errors. Neuron (2011) 4.58

Mechanisms underlying cortical activity during value-guided choice. Nat Neurosci (2012) 2.16

A reinforcement learning mechanism responsible for the valuation of free choice. Neuron (2014) 1.64

Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. Cereb Cortex (2011) 1.49

Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron (2012) 1.46

Learning latent structure: carving nature at its joints. Curr Opin Neurobiol (2010) 1.45

Neural mechanisms underlying motivation of mental versus physical effort. PLoS Biol (2012) 1.39

Linking Individual Learning Styles to Approach-Avoidance Motivational Traits and Computational Aspects of Reinforcement Learning. PLoS One (2016) 1.39

Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI. Cereb Cortex (2011) 1.33

A mechanism for value-guided choice based on the excitation-inhibition balance in prefrontal cortex. Nat Neurosci (2012) 1.21

Dissociating hippocampal and striatal contributions to sequential prediction learning. Eur J Neurosci (2012) 1.19

Ventral striatal prediction error signaling is associated with dopamine synthesis capacity and fluid intelligence. Hum Brain Mapp (2012) 1.11

Hierarchical competitions subserving multi-attribute choice. Nat Neurosci (2014) 1.10

Generalization of value in reinforcement learning by humans. Eur J Neurosci (2012) 1.07

Dissociable reward and timing signals in human midbrain and ventral striatum. Neuron (2011) 1.06

Action-specific value signals in reward-related regions of the human brain. J Neurosci (2012) 1.02

Trial-type dependent frames of reference for value comparison. PLoS Comput Biol (2013) 0.96

Cortical and hippocampal correlates of deliberation during model-based decisions for rewards in humans. PLoS Comput Biol (2013) 0.95

Effect of reinforcement history on hand choice in an unconstrained reaching task. Front Neurosci (2011) 0.92

Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia. J Neurosci (2013) 0.91

Dopamine modulates the neural representation of subjective value of food in hungry subjects. J Neurosci (2014) 0.89

Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat Neurosci (2016) 0.89

Reinforcement learning models and their neural correlates: An activation likelihood estimation meta-analysis. Cogn Affect Behav Neurosci (2015) 0.88

Episodic memory encoding interferes with reward learning and decreases striatal prediction errors. J Neurosci (2014) 0.83

Credit assignment in movement-dependent reinforcement learning. Proc Natl Acad Sci U S A (2016) 0.82

Motor preparatory activity in posterior parietal cortex is modulated by subjective absolute value. PLoS Biol (2010) 0.81

The effects of life stress and neural learning signals on fluid intelligence. Eur Arch Psychiatry Clin Neurosci (2014) 0.81

Learning to represent reward structure: a key to adapting to complex environments. Neurosci Res (2012) 0.80

Many hats: intratrial and reward level-dependent BOLD activity in the striatum and premotor cortex. J Neurophysiol (2013) 0.80

Novelty and Inductive Generalization in Human Reinforcement Learning. Top Cogn Sci (2015) 0.79

Chronic alcohol intake abolishes the relationship between dopamine synthesis capacity and learning signals in the ventral striatum. Eur J Neurosci (2014) 0.78

Bursts and heavy tails in temporal and sequential dynamics of foraging decisions. PLoS Comput Biol (2014) 0.77

Action selection in multi-effector decision making. Neuroimage (2012) 0.77

Effort, success, and nonuse determine arm choice. J Neurophysiol (2015) 0.77

Modular inverse reinforcement learning for visuomotor behavior. Biol Cybern (2013) 0.76

Unilateral medial frontal cortex lesions cause a cognitive decision-making deficit in rats. Eur J Neurosci (2014) 0.76

Will big data yield new mathematics? An evolving synergy with neuroscience. IMA J Appl Math (2016) 0.75

Inertia and Decision Making. Front Psychol (2016) 0.75

Articles cited by this

The Psychophysics Toolbox. Spat Vis (1997) 62.63

A neural substrate of prediction and reward. Science (1997) 31.30

Predictive reward signal of dopamine neurons. J Neurophysiol (1998) 15.02

Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci (2009) 12.11

Cortical substrates for exploratory decisions in humans. Nature (2006) 10.73

Neural correlates of decision variables in parietal cortex. Nature (1999) 10.33

Matching behavior and the representation of value in the parietal cortex. Science (2004) 9.83

Temporal difference models and reward-related learning in the human brain. Neuron (2003) 8.50

Learning the value of information in an uncertain world. Nat Neurosci (2007) 8.31

Reward representations and reward-related learning in the human brain: insights from neuroimaging. Curr Opin Neurobiol (2004) 8.12

Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron (2005) 7.56

Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature (2006) 6.90

Temporal prediction errors in a passive learning task activate human striatum. Neuron (2003) 6.44

Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J Neurosci (2000) 6.02

Representation of action-specific reward values in the striatum. Science (2005) 5.76

Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J Neurosci (2008) 5.64

Cognitive control signals for neural prosthetics. Science (2004) 5.37

Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci (2007) 5.04

Coding of intention in the posterior parietal cortex. Nature (1997) 4.79

The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J Neurosci (2006) 4.55

Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli. Proc Natl Acad Sci U S A (2009) 4.15

Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci (2007) 3.96

How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron (2009) 3.85

Dynamic response-by-response models of matching behavior in rhesus monkeys. J Exp Anal Behav (2005) 3.83

Reward-related responses in the human striatum. Ann N Y Acad Sci (2007) 3.72

Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput (2006) 3.59

Reach plans in eye-centered coordinates. Science (1999) 3.40

The computational neurobiology of learning and reward. Curr Opin Neurobiol (2006) 3.29

The functional organization of the intraparietal sulcus in humans and monkeys. J Anat (2005) 3.24

Functional organization of human intraparietal and frontal cortex for attending, looking, and pointing. J Neurosci (2003) 3.17

Posterior parietal cortex encodes autonomously selected motor plans. Neuron (2007) 3.13

Value representations in the primate striatum during matching behavior. Neuron (2008) 3.02

Activity in human ventral striatum locked to errors of reward prediction. Nat Neurosci (2002) 2.92

Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. J Neurosci (2007) 2.83

Bayesian theories of conditioning in a changing world. Trends Cogn Sci (2006) 2.81

Free choice activates a decision circuit between frontal and parietal cortex. Nature (2008) 2.75

Multiple model-based reinforcement learning. Neural Comput (2002) 2.74

A common reference frame for movement plans in the posterior parietal cortex. Nat Rev Neurosci (2002) 2.69

Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron (2006) 2.67

Linking nucleus accumbens dopamine and blood oxygenation. Psychopharmacology (Berl) (2007) 2.65

Single nigrostriatal dopaminergic neurons form widely spread and highly dense axonal arborizations in the neostriatum. J Neurosci (2009) 2.44

Determining a role for ventromedial prefrontal cortex in encoding action-based value signals during reward-related decision making. Cereb Cortex (2008) 2.40

Reinforcement learning: the good, the bad and the ugly. Curr Opin Neurobiol (2008) 2.24

FMRI evidence for a 'parietal reach region' in the human brain. Exp Brain Res (2003) 1.94

Structure and strength in causal induction. Cogn Psychol (2005) 1.79

Functional magnetic resonance imaging of macaque monkeys performing visually guided saccade tasks: comparison of cortical eye fields with humans. Neuron (2004) 1.73

Modular decomposition in visuomotor learning. Nature (1997) 1.72

Visual and anticipatory bias in three cortical eye fields of the monkey during an adaptive decision-making task. J Neurosci (2002) 1.71

The discovery of structural form. Proc Natl Acad Sci U S A (2008) 1.71

Behavioral and neural changes after gains and losses of conditioned reinforcers. J Neurosci (2009) 1.62

Human medial intraparietal cortex subserves visuomotor coordinate transformation. Neuroimage (2004) 1.45

Representation and timing in theories of the dopamine system. Neural Comput (2006) 1.32

Parietal neurons related to memory-guided hand manipulation. J Neurophysiol (1996) 1.14

Multiple model-based reinforcement learning explains dopamine neuronal activity. Neural Netw (2007) 0.86

Mechanisms of selection and guidance of reaching movements in the parietal lobe. Adv Neurol (2003) 0.80

Articles by these authors

Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl) (2006) 4.43

The importance of mixed selectivity in complex cognitive tasks. Nature (2013) 3.05

Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. J Neurosci (2007) 2.83

Bayesian theories of conditioning in a changing world. Trends Cogn Sci (2006) 2.81

Decision theory, reinforcement learning, and the brain. Cogn Affect Behav Neurosci (2008) 2.58

Selecting the signals for a brain-machine interface. Curr Opin Neurobiol (2004) 2.25

A computational substrate for incentive salience. Trends Neurosci (2003) 2.24

The misbehavior of value and the discipline of the will. Neural Netw (2006) 2.17

Striatal activity underlies novelty-based choice in humans. Neuron (2008) 1.92

Neural prosthetic control signals from plan activity. Neuroreport (2003) 1.89

Serotonin and dopamine: unifying affective, activational, and decision functions. Neuropsychopharmacology (2010) 1.87

A dual role for prediction error in associative learning. Cereb Cortex (2008) 1.78

Differential roles of human striatum and amygdala in associative learning. Nat Neurosci (2011) 1.71

The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol Sci (2013) 1.68

The ubiquity of model-based reinforcement learning. Curr Opin Neurobiol (2012) 1.61

Working-memory capacity protects model-based learning from stress. Proc Natl Acad Sci U S A (2013) 1.55

Serotonin selectively modulates reward value in human decision-making. J Neurosci (2012) 1.49

Neural correlates of forward planning in a spatial decision task in humans. J Neurosci (2011) 1.43

Signals in human striatum are appropriate for policy update rather than value prediction. J Neurosci (2011) 1.29

Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: evidence from a model-based fMRI study. Neuroimage (2009) 1.21

Dissociating hippocampal and striatal contributions to sequential prediction learning. Eur J Neurosci (2012) 1.19

Optimizing the decoding of movement goals from local field potentials in macaque cortex. J Neurosci (2011) 1.14

Surprise! Neural correlates of Pearce-Hall and Rescorla-Wagner coexist within the brain. Eur J Neurosci (2012) 1.12

Only coherent spiking in posterior parietal cortex coordinates looking and reaching. Neuron (2012) 1.08

Neural correlates of visual-spatial attention in electrocorticographic signals in humans. Front Hum Neurosci (2011) 1.07

A method for detection and classification of events in neural activity. IEEE Trans Biomed Eng (2006) 1.07

Generalization of value in reinforcement learning by humans. Eur J Neurosci (2012) 1.07

Anterior prefrontal cortex contributes to action selection through tracking of recent reward trends. J Neurosci (2012) 1.05

Cognitive control predicts use of model-based reinforcement learning. J Cogn Neurosci (2015) 1.05

Choice values. Nat Neurosci (2006) 1.04

Reaction time correlations during eye-hand coordination: behavior and modeling. J Neurosci (2011) 0.99

Dissociable effects of dopamine and serotonin on reversal learning. Neuron (2013) 0.99

Multiplicity of control in the basal ganglia: computational roles of striatal subregions. Curr Opin Neurobiol (2011) 0.97

Dynamic estimation of task-relevant variance in movement under risk. J Neurosci (2012) 0.97

Cortical and hippocampal correlates of deliberation during model-based decisions for rewards in humans. PLoS Comput Biol (2013) 0.95

Decoding covert spatial attention using electrocorticographic (ECoG) signals in humans. Neuroimage (2012) 0.94

Testing whether humans have an accurate model of their own motor uncertainty in a speeded reaching task. PLoS Comput Biol (2013) 0.93

Spike-field activity in parietal area LIP during coordinated reach and saccade movements. J Neurophysiol (2011) 0.90

Extraversion differentiates between model-based and model-free strategies in a reinforcement learning task. Front Hum Neurosci (2013) 0.89

Model-based learning protects against forming habits. Cogn Affect Behav Neurosci (2015) 0.89

Depression: a decision-theoretic analysis. Annu Rev Neurosci (2015) 0.87

Competition for visual selection in the oculomotor system. J Neurosci (2011) 0.87

Area MSTd neurons encode visual stimuli in eye coordinates during fixation and pursuit. J Neurophysiol (2010) 0.85

Dopamine: at the intersection of reward and action. Nat Neurosci (2007) 0.83

Reinforcement learning and higher level cognition: introduction to special issue. Cognition (2009) 0.83

The irrationality of categorical perception. J Neurosci (2013) 0.82

A likelihood method for computing selection times in spiking and local field potential activity. J Neurophysiol (2010) 0.81

Parametric models to relate spike train and LFP dynamics with neural information processing. Front Comput Neurosci (2012) 0.81

Decoding arm and hand movements across layers of the macaque frontal cortices. Conf Proc IEEE Eng Med Biol Soc (2012) 0.81

Translation speed compensation in the dorsal aspect of the medial superior temporal area. J Neurosci (2007) 0.80

Development of a closed-loop feedback system for real-time control of a high-dimensional Brain Machine Interface. Conf Proc IEEE Eng Med Biol Soc (2012) 0.80

A closer look at choice. Nat Neurosci (2010) 0.79

Grid cells, place cells, and geodesic generalization for spatial reinforcement learning. PLoS Comput Biol (2011) 0.77

Action selection in multi-effector decision making. Neuroimage (2012) 0.77

Utilizing movement synergies to improve decoding performance for a brain machine interface. Conf Proc IEEE Eng Med Biol Soc (2013) 0.77

The tracking of reaches in three-dimensions. Conf Proc IEEE Eng Med Biol Soc (2011) 0.75

Computational cognitive neuroscience. Brain Res (2009) 0.75

Multiscale decoding for reliable brain-machine interface performance over time. Conf Proc IEEE Eng Med Biol Soc (2017) 0.75

Development of semi-chronic microdrive system for large-scale circuit mapping in macaque mesolimbic and basal ganglia systems. Conf Proc IEEE Eng Med Biol Soc (2016) 0.75