AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play Paper • 2509.24193 • Published Sep 29 • 6
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published Jul 8 • 75
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale Paper • 2506.04405 • Published Jun 4 • 7
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering Paper • 2505.07782 • Published May 12 • 19
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning Paper • 2503.07459 • Published Mar 10 • 16
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Paper • 2502.06589 • Published Feb 10 • 20
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search Paper • 2310.13227 • Published Oct 20, 2023 • 14