All Relations between Reward and RL

Fields: Publication · Sentence · Publish Date · Extraction Date · Species
Maryam Zare, Parham M Kebria, Abbas Khosravi, Saeid Nahavandi. A Survey of Imitation Learning: Algorithms, Recent Developments, and Challenges. IEEE transactions on cybernetics. vol PP. 2024-07-18. PMID:39024072. As a consequence, programming their behaviors manually or defining their behavior through reward functions, as done in reinforcement learning (RL), has become exceedingly difficult. 2024-07-18 2024-07-21 Not clear
Zishun Yu, Siteng Kang, Xinhua Zhang. Offline Reward Perturbation Boosts Distributional Shift in Online RL. Uncertainty in artificial intelligence : proceedings of the ... conference. Conference on Uncertainty in Artificial Intelligence. vol 2024. 2024-07-16. PMID:39006853. Offline reward perturbation boosts distributional shift in online RL. 2024-07-16 2024-07-18 Not clear
Elena Maria Tosca, Alessandro De Carlo, Davide Ronchi, Paolo Magni. Model-Informed Reinforcement Learning for Enabling Precision Dosing Via Adaptive Dosing. Clinical pharmacology and therapeutics. 2024-07-11. PMID:38989560. In addition, a tutorial is proposed, for readers interested in delving into this field, on how a precision dosing problem should be formulated in terms of the key elements composing the RL framework (i.e., system state, agent actions, and reward function) and on how PK/PD models could enhance RL approaches. 2024-07-11 2024-07-13 Not clear
Harry Robertshaw, Lennart Karstensen, Benjamin Jackson, Alejandro Granados, Thomas C Booth. Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning. International journal of computer assisted radiology and surgery. 2024-06-17. PMID:38884893. Reinforcement learning (RL) shows potential in endovascular navigation, yet its application encounters challenges without a reward signal. 2024-06-17 2024-06-19 Not clear
Aditya Koneru, Adil Muhammed, Karthik Balasubramanian, Kiran Sasikumar, Sukriti Manna, Henry Chan, Subramanian K R S Sankaranarayanan. Ab Initio-Based Bond Order Potential for Arsenene Polymorphs Developed via Hierarchical Reinforcement Learning. The journal of physical chemistry. A. 2024-06-14. PMID:38872347. Our RL strategy utilizes decision trees coupled with a hierarchical reward strategy to accelerate convergence in high-dimensional continuous search spaces. 2024-06-14 2024-06-16 Not clear
Outongyi Lv, Bingxin Zhou, Lin F Yang. Modeling Bellman-error with logistic distribution with applications in reinforcement learning. Neural networks : the official journal of the International Neural Network Society. vol 177. 2024-05-24. PMID:38788292. This study also offers a novel theoretical contribution by establishing a clear connection between the distribution of the Bellman error and the practice of proportional reward scaling, a common technique for performance enhancement in RL. 2024-05-24 2024-05-27 Not clear
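A minimal sketch of what proportional reward scaling looks like in practice; the scale factor c, the tabular Q-update, and all numeric values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, c=0.01):
    """One tabular Q-learning step with proportional reward scaling.

    The raw reward r is multiplied by a constant c > 0 before the update;
    this rescales the magnitude of the Bellman error without changing the
    ordering of policies (c, alpha, gamma are illustrative values).
    """
    scaled_r = c * r                                            # proportional reward scaling
    td_error = scaled_r + gamma * np.max(Q[s_next]) - Q[s, a]   # Bellman error
    Q[s, a] += alpha * td_error
    return td_error
```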
Daniel G Dillon, Emily L Belleau, Julianne Origlio, Madison McKee, Aava Jahan, Ashley Meyer, Min Kang Souther, Devon Brunner, Manuel Kuhn, Yuen Siang Ang, Cristina Cusin, Maurizio Fava, Diego A Pizzagalli. Using Drift Diffusion and RL Models to Disentangle Effects of Depression On Decision-Making vs. Learning in the Probabilistic Reward Task. Computational psychiatry (Cambridge, Mass.). vol 8. issue 1. 2024-05-23. PMID:38774430. Using drift diffusion and RL models to disentangle effects of depression on decision-making vs. learning in the probabilistic reward task. 2024-05-23 2024-05-27 Not clear
Daniel G Dillon, Emily L Belleau, Julianne Origlio, Madison McKee, Aava Jahan, Ashley Meyer, Min Kang Souther, Devon Brunner, Manuel Kuhn, Yuen Siang Ang, Cristina Cusin, Maurizio Fava, Diego A Pizzagalli. Using Drift Diffusion and RL Models to Disentangle Effects of Depression On Decision-Making vs. Learning in the Probabilistic Reward Task. Computational psychiatry (Cambridge, Mass.). vol 8. issue 1. 2024-05-23. PMID:38774430. The probabilistic reward task (PRT) is widely used to investigate the impact of major depressive disorder (MDD) on reinforcement learning (RL), and recent studies have used it to provide insight into decision-making mechanisms affected by MDD. 2024-05-23 2024-05-27 Not clear
Wolfram Schultz. A dopamine mechanism for reward maximization. Proceedings of the National Academy of Sciences of the United States of America. vol 121. issue 20. 2024-05-08. PMID:38717856. Reinforcement learning (RL) formalisms use predictions, actions, and policies to maximize reward. 2024-05-08 2024-05-27 monkey
Wolfram Schultz. A dopamine mechanism for reward maximization. Proceedings of the National Academy of Sciences of the United States of America. vol 121. issue 20. 2024-05-08. PMID:38717856. Midbrain dopamine neurons code reward prediction errors (RPEs) of subjective reward value suitable for RL. 2024-05-08 2024-05-27 monkey
Wolfram Schultz. A dopamine mechanism for reward maximization. Proceedings of the National Academy of Sciences of the United States of America. vol 121. issue 20. 2024-05-08. PMID:38717856. Dopamine excitations reflect positive RPEs that increase reward predictions via RL; against increasing predictions, obtaining similar dopamine RPE signals again requires better rewards than before. 2024-05-08 2024-05-27 monkey
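A compact sketch of the textbook prediction-error formalism behind these statements (a standard Rescorla-Wagner/TD-style update, not code from the paper): the RPE delta is positive when reward exceeds the prediction, repeated positive RPEs push the prediction upward, and matching the same delta later therefore requires a larger reward.

```python
def rpe_update(V, r, alpha=0.2):
    """One step of reward-prediction-error learning (textbook form).

    V     : current reward prediction (value)
    r     : obtained reward
    alpha : learning rate (illustrative value)
    """
    delta = r - V          # reward prediction error (RPE)
    V = V + alpha * delta  # positive RPEs increase the prediction
    return V, delta

# As the prediction V climbs toward r, the same reward yields a shrinking
# RPE -- reproducing the "requires better rewards than before" effect.
V = 0.0
for _ in range(5):
    V, delta = rpe_update(V, r=1.0)
    print(f"V={V:.3f}, RPE={delta:.3f}")
```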
Xingche Guo, Donglin Zeng, Yuanjia Wang. A Semiparametric Inverse Reinforcement Learning Approach to Characterize Decision Making for Mental Disorders. Journal of the American Statistical Association. vol 119. issue 545. 2024-05-06. PMID:38706706. Motivated by the probabilistic reward task (PRT) experiment in the EMBARC study, we propose a semiparametric inverse reinforcement learning (RL) approach to characterize the reward-based decision-making of MDD patients. 2024-05-06 2024-05-08 human
Le Huu Binh, Thuy-Van T Duong. A novel and effective method for solving the router nodes placement in wireless mesh networks using reinforcement learning. PloS one. vol 19. issue 4. 2024-04-10. PMID:38598499. The RNP problem is modeled as an RL problem in which the environment, agent, action, and reward correspond to the network system, the routers, coordinate adjustment, and the connectivity of the RNP problem, respectively. 2024-04-10 2024-04-13 Not clear
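The mapping described in this sentence translates naturally into a gym-style environment skeleton; the class, the coverage-based reward, and all parameter values below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

class RouterPlacementEnv:
    """Sketch of the RNP problem as an RL environment: the state is the
    router coordinates (the network system), an action adjusts one router's
    coordinates, and the reward is a simple connectivity score."""

    def __init__(self, num_routers=4, num_clients=20, area=100.0, radius=25.0):
        self.num_routers, self.area, self.radius = num_routers, area, radius
        self.clients = np.random.uniform(0, area, size=(num_clients, 2))
        self.positions = None

    def reset(self):
        self.positions = np.random.uniform(0, self.area, size=(self.num_routers, 2))
        return self.positions.copy()

    def step(self, action):
        router, delta = action  # which router to move, and the coordinate adjustment
        self.positions[router] = np.clip(self.positions[router] + delta, 0, self.area)
        # reward: fraction of clients within range of at least one router
        dists = np.linalg.norm(self.clients[:, None] - self.positions[None], axis=-1)
        reward = float((dists.min(axis=1) <= self.radius).mean())
        return self.positions.copy(), reward, False, {}
```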
Alexander Kensert, Pieter Libin, Gert Desmet, Deirdre Cabooter. Deep reinforcement learning for the direct optimization of gradient separations in liquid chromatography. Journal of chromatography. A. vol 1720. 2024-03-05. PMID:38442496. This paper therefore aims to introduce RL, specifically proximal policy optimization (PPO), in liquid chromatography, and to evaluate whether it can be trained to optimize separations directly, based solely on the outcome of a single generic separation as input and a reward signal based on the resolution between peak pairs (taking a value in [-1, 1]). 2024-03-05 2024-03-08 Not clear
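One plausible way to build such a bounded reward from peak-pair resolutions; the target resolution and the tanh squashing are assumptions for illustration, since the record only states that the signal lies in [-1, 1]:

```python
import numpy as np

def resolution_reward(resolutions, target=1.5):
    """Map peak-pair resolutions to a reward in (-1, 1) (illustrative).

    Scores the worst-separated pair against a target resolution
    (1.5 is the usual baseline-separation rule of thumb) and squashes
    the deviation with tanh so the reward stays bounded.
    """
    worst = np.min(resolutions)            # critical peak pair
    return float(np.tanh(worst - target))  # bounded reward

# Example: three peak pairs, the worst at Rs = 0.8 -> negative reward
print(resolution_reward([2.1, 0.8, 1.9]))
```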
Thang M Le, Takeyuki Oba, Luke Couch, Lauren McInerney, Chiang-Shan R Li. The neural correlates of individual differences in reinforcement learning during pain avoidance and reward seeking. eNeuro. 2024-02-16. PMID:38365840. Reinforcement learning (RL) offers a critical framework for understanding individual differences in this associative learning by assessing the learning rate, action bias, Pavlovian factor (i.e., the extent to which action values are influenced by stimulus values), and subjective impact of outcomes (i.e., motivation to seek reward and avoid punishment). 2024-02-16 2024-02-19 human
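These four quantities fit a standard computational-modeling template; a minimal sketch of how they typically enter the choice rule and value update in a Go/NoGo-style model (the exact parameterization in the paper may differ, and all names and values here are illustrative):

```python
import numpy as np

def choice_prob_go(q_go, q_nogo, stimulus_value, b, pi):
    """Probability of emitting the 'go' action (illustrative Go/NoGo form).

    b  : action bias, a fixed boost for 'go'
    pi : Pavlovian factor, weighting the stimulus value into the 'go' tendency
    """
    w_go = q_go + b + pi * stimulus_value
    return 1.0 / (1.0 + np.exp(-(w_go - q_nogo)))

def update_q(q, outcome, alpha=0.3, rho_reward=1.0, rho_punish=1.0):
    """Delta-rule update with learning rate alpha; rho_* scale the subjective
    impact of reward vs. punishment outcomes (all values illustrative)."""
    rho = rho_reward if outcome > 0 else rho_punish
    return q + alpha * (rho * outcome - q)
```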
Xuan-Kun Li, Jian-Xu Ma, Xiang-Yu Li, Jun-Jie Hu, Chuan-Yang Ding, Feng-Kai Han, Xiao-Min Guo, Xi Tan, Xian-Min Jin. High-efficiency reinforcement learning with hybrid architecture photonic integrated circuit. Nature communications. vol 15. issue 1. 2024-02-05. PMID:38316815. By introducing similarity information into the reward function of the RL model, PIC-RL successfully accomplishes a perovskite materials synthesis task within a 3472-dimensional state space, resulting in a notable 56% improvement in efficiency. 2024-02-05 2024-02-09 Not clear
Jyun-Wei Li, Yu-Chieh Teng, Shinji Nimura, Yibeltal Chanie Manie, Kamya Yekeh Yazdandoost, Kazuki Tanaka, Ryo Inohara, Takehiro Tsuritani, Peng-Chun Peng. Reinforcement learning-based adaptive beam alignment in a photodiode-integrated array antenna module. Optics letters. vol 49. issue 3. 2024-02-01. PMID:38300085. In our proposed scheme, the three key elements of RL (state, action, and reward) are represented as the phase values in the photonic array antenna, phase changes with specified steps, and an obtained error vector magnitude (EVM) value, respectively. 2024-02-01 2024-02-03 Not clear
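Since a lower EVM means better alignment, the natural reward shape is a decreasing function of EVM; a minimal sketch under that assumption (the step logic, the negated-EVM reward, and the measure_evm callable are illustrative, not the authors' scheme):

```python
import numpy as np

def beam_alignment_step(phases, action_deltas, measure_evm):
    """One RL interaction in the phase-alignment loop (illustrative sketch).

    state  : the current phase values of the array antenna elements
    action : phase changes with a specified step size
    reward : derived from the measured error vector magnitude (EVM);
             lower EVM = better signal, so the EVM is negated here.
    """
    phases = (phases + action_deltas) % (2 * np.pi)  # apply the phase steps
    evm = measure_evm(phases)                        # hypothetical measurement hook
    reward = -evm                                    # assumed reward shaping
    return phases, reward
```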
Zeyong Wei, Honghua Chen, Liangliang Nan, Jun Wang, Jing Qin, Mingqiang Wei. PathNet: Path-Selective Point Cloud Denoising. IEEE transactions on pattern analysis and machine intelligence. vol PP. 2024-01-19. PMID:38241116. First, to leverage geometry expertise and benefit from training data, we propose a noise- and geometry-aware reward function to train the routing agent in RL. 2024-01-19 2024-01-22 Not clear
Antoine Théberge, Christian Desrosiers, Arnaud Boré, Maxime Descoteaux, Pierre-Marc Jodoin. What matters in reinforcement learning for tractography. Medical image analysis. vol 93. 2024-01-14. PMID:38219499. In this work, we thoroughly explore the different components of the proposed framework, such as the choice of RL algorithm, the seeding strategy, the input signal, and the reward function, and shed light on their impact. 2024-01-14 2024-01-17 Not clear
Antoine Théberge, Christian Desrosiers, Arnaud Boré, Maxime Descoteaux, Pierre-Marc Jodoin. What matters in reinforcement learning for tractography. Medical image analysis. vol 93. 2024-01-14. PMID:38219499. As such, we ultimately propose a series of recommendations concerning the choice of RL algorithm, the input to the agents, the reward function, and more, to help future work using reinforcement learning for tractography. 2024-01-14 2024-01-17 Not clear