ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published 8 days ago • 77
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published 10 days ago • 70
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published 11 days ago • 95
InteractComp: Evaluating Search Agents With Ambiguous Queries Paper • 2510.24668 • Published 10 days ago • 96
E^2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker Paper • 2510.22733 • Published 12 days ago • 31
Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting Paper • 2505.19716 • Published May 26 • 4
You Don't Know Until You Click: Automated GUI Testing for Production-Ready Software Evaluation Paper • 2508.14104 • Published Aug 17 • 1
VeritasFi: An Adaptable, Multi-tiered RAG Framework for Multi-modal Financial Question Answering Paper • 2510.10828 • Published 26 days ago • 1
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published 11 days ago • 118
A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published 11 days ago • 65
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning Paper • 2509.23768 • Published Sep 28 • 48
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 109
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19 • 118