- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 141
- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
  Paper • 2504.13837 • Published • 136
- Learning to Reason under Off-Policy Guidance
  Paper • 2504.14945 • Published • 88
Collections including paper arxiv:2510.04618
- The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
  Paper • 2509.26507 • Published • 532
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 483
- Agent Learning via Early Experience
  Paper • 2510.08558 • Published • 262
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 120

- MacroBench: A Novel Testbed for Web Automation Scripts via Large Language Models
  Paper • 2510.04363 • Published
- Control Plane as a Tool: A Scalable Design Pattern for Agentic AI Systems
  Paper • 2505.06817 • Published
- Agentic Web: Weaving the Next Web with AI Agents
  Paper • 2507.21206 • Published
- Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning
  Paper • 2410.02052 • Published • 9

- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 120
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 483
- Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
  Paper • 2502.04644 • Published • 4
- PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths
  Paper • 2502.14902 • Published • 1