Forgetting in Reinforcement Learning Links Sustained Dopamine Signals to Motivation
- Date and time
- Wednesday, May 27, 2020, 10:00 - 10:45 (JST)
- Speaker
- Ayaka Kato (加藤 郁佳) (RIKEN Center for Brain Science (CBS) / PhD student, The University of Tokyo)
- Venue
- via Zoom
- Language
- English
Dopamine (DA) has been suggested to have two reward-related roles: (1) representing reward prediction error (RPE), and (2) providing motivational drive. Role (1) is based on physiological findings that DA responds to unpredicted but not to predicted reward, whereas role (2) is supported by pharmacological findings that blockade of DA signaling causes motivational impairments such as slowing of self-paced behavior. The synaptic and circuit mechanisms for role (1), i.e., how RPE is calculated upstream of DA neurons and how RPE-dependent updating of learned values occurs through DA-dependent synaptic plasticity, are now becoming clear, but the mechanisms for role (2) remain unclear. We modeled self-paced behavior as a series of ‘Go’ or ‘No-Go’ selections in a reinforcement-learning framework that assumes DA's role (1). We demonstrate that incorporating decay (forgetting) of learned values, presumably implemented as decay of the synaptic strengths that store those values, provides a potential unified mechanistic account of DA's two roles, together with its various temporal patterns.
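The idea in the abstract can be illustrated with a minimal sketch. This is not the speaker's actual model: the update rule (Q-learning with a softmax choice between 'Go' and 'No-Go'), the function name `simulate`, and every parameter value (`alpha`, `gamma`, `beta`, `decay`, the number of states) are assumptions chosen only for illustration. The agent walks along a chain of states toward a rewarded goal; 'Go' advances one state and 'No-Go' stays in place. After every time step, all learned values are multiplied by `1 - decay`, which stands in for forgetting.

```python
import math
import random


def simulate(decay, n_states=7, n_trials=300, alpha=0.5, gamma=0.97,
             beta=5.0, seed=0):
    """Go/No-Go chain task with Q-learning and value decay (illustrative only).

    Returns (mean_rpe_on_go, mean_steps_per_trial), both measured over the
    second half of the trials. mean_rpe_on_go[s] is the average RPE produced
    by 'Go' actions taken in state s.
    """
    rng = random.Random(seed)
    n_choice = n_states - 1                     # the goal state has no choice
    q = [[0.0, 0.0] for _ in range(n_choice)]   # q[s][0]: No-Go, q[s][1]: Go
    late = n_trials // 2
    rpe_sum = [0.0] * n_choice
    rpe_cnt = [0] * n_choice
    steps_late = 0

    for trial in range(n_trials):
        s = 0
        while s < n_choice:
            # Softmax (logistic) choice between Go and No-Go.
            p_go = 1.0 / (1.0 + math.exp(-beta * (q[s][1] - q[s][0])))
            a = 1 if rng.random() < p_go else 0
            s_next = s + a
            at_goal = s_next == n_choice
            reward = 1.0 if at_goal else 0.0
            v_next = 0.0 if at_goal else max(q[s_next])

            # Reward prediction error and value update (assumed DA role 1).
            rpe = reward + gamma * v_next - q[s][a]
            q[s][a] += alpha * rpe

            # Forgetting: every learned value decays a little at each step.
            for row in q:
                row[0] *= 1.0 - decay
                row[1] *= 1.0 - decay

            if trial >= late:
                steps_late += 1
                if a == 1:
                    rpe_sum[s] += rpe
                    rpe_cnt[s] += 1
            s = s_next

    mean_rpe = [rpe_sum[i] / rpe_cnt[i] for i in range(n_choice)]
    return mean_rpe, steps_late / (n_trials - late)


if __name__ == "__main__":
    for decay in (0.0, 0.01):
        rpe, steps = simulate(decay)
        print(f"decay={decay}: RPE at final Go = {rpe[-1]:.4f}, "
              f"steps per trial = {steps:.2f}")
```

In this sketch, the RPE at the rewarded step vanishes once learning has converged without decay, whereas with decay the stored value is always slightly below its target, so a positive RPE persists on every trial. Whether and how this sustained signal affects the speed of behavior depends on the assumed parameters, and the sketch makes no claim about the quantitative results presented in the talk.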