Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation Paper • 2510.17354 • Published 22 days ago • 33
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents Paper • 2411.06559 • Published Nov 10, 2024 • 16
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
GRACE: Generative Representation Learning via Contrastive Policy Optimization Paper • 2510.04506 • Published Oct 6 • 10
Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window Paper • 2510.08276 • Published Oct 9 • 9
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10 • 48
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers Paper • 2311.17136 • Published Nov 28, 2023 • 8
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research Paper • 2505.19253 • Published May 25 • 32
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Paper • 2409.12959 • Published Sep 19, 2024 • 38
REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation Paper • 2508.08149 • Published Aug 11 • 2
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 187
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4 • 73
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published Aug 2 • 236
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published Aug 7 • 137
Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards Paper • 2506.11474 • Published Jun 13 • 17