iwasomen's picture

13

iwasomen

iwasomen

·

AI & ML interests

agent/rag/deep research

Organizations

upvoted 7 papers 3 months ago

Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs

Paper • 2509.24107 • Published Sep 28, 2025 • 78

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1, 2025 • 119

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 174

MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools

Paper • 2509.09734 • Published Sep 10, 2025 • 15

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 114

WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

Paper • 2509.13305 • Published Sep 16, 2025 • 91

upvoted 6 papers 4 months ago

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published Sep 8, 2025 • 79

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 101

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 195

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7, 2025 • 150

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 660