L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks Paper • 2510.20976 • Published 16 days ago • 2 • 2
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136 • 3
The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published Jul 20 • 84 • 9
When to Trust Context: Self-Reflective Debates for Context Reliability Paper • 2506.06020 • Published Jun 6 • 1 • 2