ActiveVLN: Towards Active Exploration via Multi-Turn RL in Vision-and-Language Navigation Paper • 2509.12618 • Published Sep 16 • 1
LTD-Bench: Evaluating Large Language Models by Letting Them Draw Paper • 2511.02347 • Published Nov 4 • 8
Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning Paper • 2510.10207 • Published Oct 11
RoRecomp: Enhancing Reasoning Efficiency via Rollout Response Recomposition in Reinforcement Learning Paper • 2509.25958 • Published Sep 30
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space Paper • 2511.20102 • Published 14 days ago • 26
SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger Paper • 2303.17561 • Published Mar 30, 2023
VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting Paper • 2510.21817 • Published Oct 21 • 41
VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation Paper • 2510.09607 • Published Oct 10 • 2
Aligning and Prompting Everything All at Once for Universal Visual Perception Paper • 2312.02153 • Published Dec 4, 2023
MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL Paper • 2312.11242 • Published Dec 18, 2023
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning Paper • 2509.22601 • Published Sep 26 • 29
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference Paper • 2508.15881 • Published Aug 21 • 9
Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning Paper • 2508.19855 • Published Aug 27 • 7
CoDiEmb: A Collaborative yet Distinct Framework for Unified Representation Learning in Information Retrieval and Semantic Textual Similarity Paper • 2508.11442 • Published Aug 15 • 3
Co-Salient Object Detection with Co-Representation Purification Paper • 2303.07670 • Published Mar 14, 2023
MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples Paper • 2312.06363 • Published Dec 11, 2023 • 1
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema Paper • 2402.11811 • Published Feb 19, 2024
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models Paper • 2402.19014 • Published Feb 29, 2024