Publication | Sentence | Publish Date | Extraction Date | Species |
Pierre Schumacher, Thomas Geijtenbeek, Vittorio Caggiano, Vikash Kumar, Syn Schmitt, Georg Martius, Daniel F B Haeufl. Emergence of natural and robust bipedal walking by learning from biologically plausible objectives. iScience. vol 28. issue 4. 2025-04-17. PMID:40241757. | we demonstrate that the combination of a recent rl algorithm with a biologically plausible reward is capable of learning controllers for 4 different musculoskeletal models and achieves locomotion with up to 90 muscles without demonstrations. | 2025-04-17 | 2025-04-19 | human |
Mi Li, Xiaolong Pan, Chuhui Liu, Zirui L. Federated deep reinforcement learning-based urban traffic signal optimal control. Scientific reports. vol 15. issue 1. 2025-04-05. PMID:40188158. | by reasonably designing the state, action and reward functions and determining the optimal values of several key parameters in the federated collaboration mechanism, the rl model could ensure high learning efficiency and fast convergence in the face of the gradual increase of road network size and the exponential increase of state and action space with the number of intersections. | 2025-04-05 | 2025-04-08 | Not clear |
Xin Liu, Yaran Chen, Guixing Chen, Haoran Li, Dongbin Zha. Balancing State Exploration and Skill Diversity in Unsupervised Skill Discovery. IEEE transactions on cybernetics. vol PP. 2025-03-28. PMID:40138236. | unsupervised skill discovery seeks to acquire different useful skills without extrinsic reward via unsupervised reinforcement learning (rl), with the discovered skills efficiently adapting to multiple downstream tasks in various ways. | 2025-03-28 | 2025-03-30 | Not clear |
Ahmed H Yakout, Ahmed E B Abu-Elanien, Hany M Hasanie. Rotor angle stability enhancement using DDPG reinforcement agent with Gorilla troops optimized input scaling factors. Scientific reports. vol 15. issue 1. 2025-03-24. PMID:40122894. | furthermore, the rl reward considered is a discrete function that rewards the generators' accelerating power samples when they are below a defined value. | 2025-03-24 | 2025-03-26 | Not clear |
Tingxuan Chen, Liu Yang, Zidong Wang, Jun Lon. A rule- and query-guided reinforcement learning for extrapolation reasoning in temporal knowledge graphs. Neural networks : the official journal of the International Neural Network Society. vol 185. 2025-03-08. PMID:40055886. | specifically, logirl innovatively designs a temporal logic rule-guided reward mechanism, steering rl agents toward actions that are consistent with established rules, thereby fostering the generation of explainable and logical reasoning paths. | 2025-03-08 | 2025-03-12 | Not clear |
Christian E Waugh, Adam P Porth, Xuanyu Fang, L Paul Sands, Kenneth T Kishid. What do we actually want to experience? A computational metric for assessing reward values. Research square. 2025-03-04. PMID:40034432. | in two studies, we demonstrate that using a combination of reinforcement learning (rl) paradigms and computational modeling, we can measure computationally inferred reward values (crv) of experiences, which do not rely on conscious self-report. | 2025-03-04 | 2025-03-06 | human |
Georgia Turner, Amanda M Ferguson, Tanay Katiyar, Stefano Palminteri, Amy Orbe. Old Strategies, New Environments: Reinforcement Learning on Social Media. Biological psychiatry. 2025-03-03. PMID:39725300. | rl, which has proven to be successful in characterizing human social behavior, consists of 3 stages: updating expected reward, valuating expected reward by integrating subjective costs such as effort, and selecting an action. | 2025-03-03 | 2025-03-05 | human |
Georgia Turner, Amanda M Ferguson, Tanay Katiyar, Stefano Palminteri, Amy Orbe. Old Strategies, New Environments: Reinforcement Learning on Social Media. Biological psychiatry. 2025-03-03. PMID:39725300. | the rl framework describes a process by which an agent can learn to maximize their long-term reward. | 2025-03-03 | 2025-03-05 | human |
Yuji Cao, Huan Zhao, Yuheng Cheng, Ting Shu, Yue Chen, Guolong Liu, Gaoqi Liang, Junhua Zhao, Jinyue Yan, Yun L. Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods. IEEE transactions on neural networks and learning systems. vol PP. 2025-03-03. PMID:40030358. | utilizing the classical agent-environment interaction paradigm, we propose a structured taxonomy to systematically categorize llms' functionalities in rl, including four roles: information processor, reward designer, decision-maker, and generator. | 2025-03-03 | 2025-03-14 | Not clear |
Sesun You, Kwankyun Byeon, Jiwon Seo, Wonhee Kim, Masayoshi Tomizuk. Policy-Iteration-Based Active Disturbance Rejection Control for Uncertain Nonlinear Systems With Unknown Relative Degree. IEEE transactions on cybernetics. vol PP. 2025-03-03. PMID:40031680. | the rl agent is designed to minimize the quadratic reward as the performance index function while enhancing the influence of the partial control input associated with the correct relative degree via the policy iteration procedure. | 2025-03-03 | 2025-03-06 | Not clear |
Ethan M Campbell, Wanting Zhong, Jeremy Hogeveen, Jordan Grafma. Dorsal-Ventral Reinforcement Learning Network Connectivity and Incentive-Driven Changes in Random Exploration. The Journal of neuroscience : the official journal of the Society for Neuroscience. 2025-02-27. PMID:40015985. | these findings suggest that integrity of functional connections between stimulus valuation (ventral) and action valuation (dorsal) rl networks is associated with changes in the balance between explore-exploit decisions under changing reward incentives. | 2025-02-27 | 2025-03-02 | Not clear |
Yu Sun Chung, Berry van den Berg, Kenneth C Roberts, Armen Bagdasarov, Marty G Woldorff, Michael S Gaffre. Electrical brain activations in preadolescents during a probabilistic reward-learning task reflect cognitive processes and behavior strategies. Frontiers in human neuroscience. vol 19. 2025-02-14. PMID:39949988. | both adults and children learn through feedback to associate environmental events and choices with reward, a process known as reinforcement learning (rl). | 2025-02-14 | 2025-02-16 | Not clear |
Hiroshi Makino, Ahmad Suhaim. Distributed representations of temporally accumulated reward prediction errors in the mouse cortex. Science advances. vol 11. issue 4. 2025-01-22. PMID:39841828. | rpe representations in mice aligned with theoretical predictions of rl, emerging during learning and being subject to manipulations of the reward function. | 2025-01-22 | 2025-01-25 | mouse |
Juliana E Trach, Megan T deBettencourt, Angela Radulescu, Samuel D McDougl. Rewards transiently and automatically enhance sustained attention. Journal of experimental psychology. General. 2025-01-21. PMID:39836118. | here, we investigated the influence of reward feedback on attentional vigilance during a simultaneous sustained attention and reinforcement learning (rl) task. | 2025-01-21 | 2025-01-23 | Not clear |
Chiara Montemitro, Paolo Ossola, Thomas J Ross, Quentin J M Huys, John R Fedota, Betty Jo Salmeron, Massimo di Giannantonio, Elliot A Stei. Longitudinal changes in reinforcement learning during smoking cessation: a computational analysis using a probabilistic reward task. Scientific reports. vol 14. issue 1. 2024-12-31. PMID:39741189. | in a longitudinal, within-subject design, we used a probabilistic reward task (prt) to assess rl in twenty smokers who successfully refrained from smoking for at least 30 days. | 2024-12-31 | 2025-01-03 | Not clear |
Chiara Montemitro, Paolo Ossola, Thomas J Ross, Quentin J M Huys, John R Fedota, Betty Jo Salmeron, Massimo di Giannantonio, Elliot A Stei. Longitudinal changes in reinforcement learning during smoking cessation: a computational analysis using a probabilistic reward task. Scientific reports. vol 14. issue 1. 2024-12-31. PMID:39741189. | while it is plausible that some changes in task performance could be attributed to task repetition effects, we observed a clear impact of the nicotine withdrawal syndrome (nws) on rl, and a dynamic relationship between craving and reward and punishment sensitivity over time, suggesting a significant recalibration of cognitive processes during abstinence. | 2024-12-31 | 2025-01-03 | Not clear |
Luca Giamattei, Matteo Biagiola, Roberto Pietrantuono, Stefano Russo, Paolo Tonell. Reinforcement learning for online testing of autonomous driving systems: a replication and extension study. Empirical software engineering. vol 30. issue 1. 2024-11-14. PMID:39513128. | our extension aims at eliminating some of the possible reasons for the poor performance of rl observed in our replication: (1) the presence of reward components providing contrasting feedback to the rl agent; (2) the usage of an rl algorithm (q-learning) which requires discretization of an intrinsically continuous state space. | 2024-11-14 | 2024-11-16 | Not clear |
Tamás Bécs. RRT-guided experience generation for reinforcement learning in autonomous lane keeping. Scientific reports. vol 14. issue 1. 2024-10-14. PMID:39402145. | the study centers on the lane-keeping problem of a dynamic vehicle model handled by rl, examining a scenario where reward shaping is omitted, leading to sparse rewards. | 2024-10-14 | 2024-10-17 | Not clear |
Ajnabiul Hoque, Mihir Surve, Shivaram Kalyanakrishnan, Raghavan B Suno. Reinforcement Learning for Improving Chemical Reaction Performance. Journal of the American Chemical Society. 2024-10-02. PMID:39356950. | our engineered reward function includes a tanimoto-based uniqueness factor within the rl loop that improved the exploration of the environment and has helped accrue larger returns. | 2024-10-02 | 2024-10-05 | Not clear |
Xingche Guo, Donglin Zeng, Yuanjia Wan. HMM for discovering decision-making dynamics using reinforcement learning experiments. Biostatistics (Oxford, England). 2024-09-03. PMID:39226534. | reinforcement learning (rl) models are fitted to extract parameters that measure various aspects of reward processing (e.g. | 2024-09-03 | 2024-09-06 | human |