The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published 7 days ago • 113
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping Paper • 2510.18927 • Published 16 days ago • 82
A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published 10 days ago • 65
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published 10 days ago • 172
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published 16 days ago • 59
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published 16 days ago • 110
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published 13 days ago • 92
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published 10 days ago • 117
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning Paper • 2509.13305 • Published Sep 16 • 89
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16 • 105
InteractComp: Evaluating Search Agents With Ambiguous Queries Paper • 2510.24668 • Published 9 days ago • 96
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published 10 days ago • 95
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6 • 111
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 109