Collections
Discover the best community collections!
Collections including paper arxiv:2510.05592
-
Seedream 4.0: Toward Next-generation Multimodal Image Generation
Paper • 2509.20427 • Published • 79 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 87 -
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Paper • 2510.06917 • Published • 34 -
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper • 2510.04618 • Published • 121
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 5 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 33 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 142 -
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
Paper • 2506.16035 • Published • 88
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published • 32 -
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL
Paper • 2503.23157 • Published • 10 -
AI Agents: Evolution, Architecture, and Real-World Applications
Paper • 2503.12687 • Published • 2
-
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 97 -
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Paper • 2501.02790 • Published • 9 -
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper • 2509.25154 • Published • 29 -
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper • 2509.25760 • Published • 55
-
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 116 -
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper • 2509.13305 • Published • 90 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 102 -
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
Paper • 2510.14438 • Published • 13
-
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 69 -
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Paper • 2509.03403 • Published • 22 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23 -
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
Paper • 2509.00930 • Published • 4
-
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
Paper • 2505.01441 • Published • 39 -
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper • 2504.16078 • Published • 21 -
Emergent Agentic Transformer from Chain of Hindsight Experience
Paper • 2305.16554 • Published -
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models
Paper • 2504.02882 • Published • 7
-
Edify 3D: Scalable High-Quality 3D Asset Generation
Paper • 2411.07135 • Published • 1 -
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Paper • 2410.02707 • Published • 47 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 151 -
LongCat-Video Technical Report
Paper • 2510.22200 • Published • 25
-
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
Paper • 2502.06060 • Published • 38 -
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 192 -
SurveyX: Academic Survey Automation via Large Language Models
Paper • 2502.14776 • Published • 100
-
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 116 -
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper • 2509.13305 • Published • 90 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 102 -
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
Paper • 2510.14438 • Published • 13
-
Seedream 4.0: Toward Next-generation Multimodal Image Generation
Paper • 2509.20427 • Published • 79 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 87 -
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Paper • 2510.06917 • Published • 34 -
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper • 2510.04618 • Published • 121
-
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 69 -
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Paper • 2509.03403 • Published • 22 -
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
Paper • 2509.03405 • Published • 23 -
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
Paper • 2509.00930 • Published • 4
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 5 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 33 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 142 -
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
Paper • 2506.16035 • Published • 88
-
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
Paper • 2505.01441 • Published • 39 -
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper • 2504.16078 • Published • 21 -
Emergent Agentic Transformer from Chain of Hindsight Experience
Paper • 2305.16554 • Published -
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models
Paper • 2504.02882 • Published • 7
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published • 32 -
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL
Paper • 2503.23157 • Published • 10 -
AI Agents: Evolution, Architecture, and Real-World Applications
Paper • 2503.12687 • Published • 2
-
Edify 3D: Scalable High-Quality 3D Asset Generation
Paper • 2411.07135 • Published • 1 -
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Paper • 2410.02707 • Published • 47 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 151 -
LongCat-Video Technical Report
Paper • 2510.22200 • Published • 25
-
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 97 -
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Paper • 2501.02790 • Published • 9 -
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper • 2509.25154 • Published • 29 -
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper • 2509.25760 • Published • 55
-
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
Paper • 2502.06060 • Published • 38 -
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 192 -
SurveyX: Academic Survey Automation via Large Language Models
Paper • 2502.14776 • Published • 100