All Relations Between Reward and RL

Publication | Sentence | Publish Date | Extraction Date | Species
Darrell A Worthy, W Todd Maddox. Age-based differences in strategy use in choice tasks. Frontiers in Neuroscience. vol 5. 2012-10-02. PMID:22232573. | we compared the fits of a model that assumes the use of a win-stay-lose-shift (wsls) heuristic to make decisions, to the fits of a reinforcement-learning (rl) model that compared expected reward values for each option to make decisions. | 2012-10-02 | 2023-08-12 | Not clear
Norikazu Sugimoto, Masahiko Haruno, Kenji Doya, Mitsuo Kawato. MOSAIC for multiple-reward environments. Neural Computation. vol 24. issue 3. 2012-05-24. PMID:22168558. | to achieve high performance, rl controllers must consider the complex external dynamics for movements and task (reward function) and optimize control commands. | 2012-05-24 | 2023-08-12 | Not clear
Norikazu Sugimoto, Masahiko Haruno, Kenji Doya, Mitsuo Kawato. MOSAIC for multiple-reward environments. Neural Computation. vol 24. issue 3. 2012-05-24. PMID:22168558. | in this article, we address how an rl agent should be designed to handle such double complexity of dynamics and reward. | 2012-05-24 | 2023-08-12 | Not clear
Norikazu Sugimoto, Masahiko Haruno, Kenji Doya, Mitsuo Kawato. MOSAIC for multiple-reward environments. Neural Computation. vol 24. issue 3. 2012-05-24. PMID:22168558. | it resembles mosaic in spirit and selects and learns an appropriate rl controller based on the rl controller's td error using the errors of the dynamics (the forward model) and the reward predictors. | 2012-05-24 | 2023-08-12 | Not clear
Norikazu Sugimoto, Masahiko Haruno, Kenji Doya, Mitsuo Kawato. MOSAIC for multiple-reward environments. Neural Computation. vol 24. issue 3. 2012-05-24. PMID:22168558. | the simulation results demonstrate that mosaic-mr outperforms other counterparts because of this flexible association ability among rl controllers, forward models, and reward predictors. | 2012-05-24 | 2023-08-12 | Not clear
Marios G Philiastides, Guido Biele, Niki Vavatzanidis, Philipp Kazzer, Hauke R Heekeren. Temporal dynamics of prediction error processing during reward-based decision making. NeuroImage. vol 53. issue 1. 2011-01-04. PMID:20510376. | these representations can be acquired with reinforcement learning (rl) mechanisms, which use the prediction error (pe, the difference between expected and received rewards) as a learning signal to update reward expectations. | 2011-01-04 | 2023-08-12 | Not clear
Marios G Philiastides, Guido Biele, Niki Vavatzanidis, Philipp Kazzer, Hauke R Heekeren. Temporal dynamics of prediction error processing during reward-based decision making. NeuroImage. vol 53. issue 1. 2011-01-04. PMID:20510376. | importantly, our single-trial eeg analysis based on pes from an rl model showed that the feedback-related potentials do not merely reflect error awareness, but rather quantitative information crucial for learning reward contingencies. | 2011-01-04 | 2023-08-12 | Not clear
Mattia Rigotti, Daniel Ben Dayan Rubin, Sara E Morrison, C Daniel Salzman, Stefano Fusi. Attractor concretion as a mechanism for the formation of context representations. NeuroImage. vol 52. issue 3. 2010-10-27. PMID:20100580. | given a set of mental states, reinforcement learning (rl) algorithms predict the optimal policy that maximizes future reward. | 2010-10-27 | 2023-08-12 | Not clear
Shesharao M Wanjerkhede, Raju S Bapi. Modeling the sub-cellular signaling pathways involved in reinforcement learning at the striatum. Progress in Brain Research. vol 168. 2008-06-20. PMID:18166396. | the process of learning actions by reward or punishment is called 'instrumental conditioning' or 'reinforcement learning' (rl). | 2008-06-20 | 2023-08-12 | Not clear
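
The entries above describe several standard reinforcement-learning mechanisms; the sketches below illustrate them under stated assumptions rather than reproducing any paper's implementation.

The Worthy & Maddox entry (PMID:22232573) contrasts a win-stay-lose-shift (WSLS) heuristic with an RL model that compares expected reward values for each option. A minimal sketch of the two decision rules follows; the function names, the 0.5 win threshold, the softmax inverse temperature, and the learning rate are illustrative assumptions, not values from the paper.

```python
import math
import random

def wsls_choice(prev_choice, prev_reward, n_options=2, win_threshold=0.5):
    """Win-stay-lose-shift: repeat the last choice after a win, otherwise switch."""
    if prev_choice is None:
        return random.randrange(n_options)
    if prev_reward >= win_threshold:
        return prev_choice                                                   # win -> stay
    return random.choice([a for a in range(n_options) if a != prev_choice])  # lose -> shift

def rl_choice(q_values, beta=3.0):
    """RL-style choice: softmax over learned expected reward values."""
    weights = [math.exp(beta * q) for q in q_values]
    return random.choices(range(len(q_values)), weights=weights)[0]

def rl_update(q_values, action, reward, alpha=0.1):
    """Move the chosen option's expected reward toward the obtained reward."""
    q_values[action] += alpha * (reward - q_values[action])
    return q_values
```

Model comparison of the kind described in that sentence would then score the observed choices under each rule, for example by likelihood, and compare the resulting fits.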
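
The Sugimoto et al. entries (PMID:22168558) describe selecting and learning an appropriate RL controller from the prediction errors of paired forward models and reward predictors. The sketch below shows one way such soft module selection could work; the Gaussian responsibility form, the weighting parameter w, and the function name are assumptions, not the authors' MOSAIC-MR algorithm.

```python
import math

def module_responsibilities(dynamics_errors, reward_errors, sigma=1.0, w=0.5):
    """Weight each (forward model, reward predictor, RL controller) module by how well
    it predicted the current dynamics and reward; smaller errors yield larger weights."""
    scores = [
        math.exp(-(w * de ** 2 + (1.0 - w) * re ** 2) / (2.0 * sigma ** 2))
        for de, re in zip(dynamics_errors, reward_errors)
    ]
    total = sum(scores)
    return [s / total for s in scores]

# Module 1 predicted both dynamics and reward best, so its controller dominates.
print(module_responsibilities(dynamics_errors=[0.9, 0.1, 0.6],
                              reward_errors=[0.8, 0.2, 0.7]))
```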
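
The Philiastides et al. entries (PMID:20510376) treat the prediction error, the mismatch between expected and received reward, as the learning signal that updates reward expectations, and use trial-by-trial prediction errors from an RL model in a single-trial EEG analysis. A minimal delta-rule learner of that kind is sketched below; the learning rate and initial expectation are placeholder values.

```python
def prediction_errors(rewards, v0=0.5, alpha=0.2):
    """Return per-trial prediction errors pe_t = r_t - v_t, updating the reward
    expectation v_{t+1} = v_t + alpha * pe_t after each outcome."""
    v, pes = v0, []
    for r in rewards:
        pe = r - v
        pes.append(pe)
        v += alpha * pe
    return pes

# A mostly rewarded option: the errors shrink as the expectation adapts.
print(prediction_errors([1, 1, 0, 1, 1]))
```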
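
The Rigotti et al. entry (PMID:20100580) states that, given a set of mental states, RL algorithms predict the optimal policy that maximizes future reward. When the transition and reward structure are known, one textbook way to compute such a policy is value iteration, sketched here on a hypothetical two-state problem; the example MDP and variable names are illustrative.

```python
def value_iteration(P, R, gamma=0.9, n_iters=100):
    """Tabular value iteration: V(s) <- max_a [ R[s][a] + gamma * sum_s' P[s][a][s'] V(s') ].
    P[s][a] maps next states to probabilities; R[s][a] is the expected immediate reward."""
    V = [0.0] * len(P)
    for _ in range(n_iters):
        V = [max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                 for a in range(len(P[s])))
             for s in range(len(P))]
    # Greedy policy with respect to the converged values.
    policy = [max(range(len(P[s])),
                  key=lambda a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items()))
              for s in range(len(P))]
    return V, policy

# Two states, two actions: action 1 in state 0 moves to state 1, which pays off.
P = [[{0: 1.0}, {1: 1.0}], [{1: 1.0}, {0: 1.0}]]
R = [[0.0, 0.0], [1.0, 0.0]]
print(value_iteration(P, R))
```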