agent RL - a ReneeA1 Collection

ReneeA1 's Collections

agent RL

updated Sep 16

Tool-integrated Reinforcement Learning for Repo Deep Search

Paper • 2508.03012 • Published Aug 5 • 20
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 107
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

Paper • 2509.09265 • Published Sep 11 • 45
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 186
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published Sep 8 • 78
Reinforcement Learning Foundations for Deep Research Systems: A Survey

Paper • 2509.06733 • Published Sep 8 • 32
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3 • 21
DCPO: Dynamic Clipping Policy Optimization

Paper • 2509.02333 • Published Sep 2 • 21
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published Aug 28 • 35