-
MADD: Multi-Agent Drug Discovery Orchestra
Paper • 2511.08217 • Published • 55 -
The Station: An Open-World Environment for AI-Driven Discovery
Paper • 2511.06309 • Published • 35 -
An AI system to help scientists write expert-level empirical software
Paper • 2509.06503 • Published • 6 -
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26
Collections
Discover the best community collections!
Collections including paper arxiv:2510.25992
-
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
The End of Manual Decoding: Towards Truly End-to-End Language Models
Paper • 2510.26697 • Published • 114
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 31 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 61 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 16
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 60 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 42 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 27
-
MADD: Multi-Agent Drug Discovery Orchestra
Paper • 2511.08217 • Published • 55 -
The Station: An Open-World Environment for AI-Driven Discovery
Paper • 2511.06309 • Published • 35 -
An AI system to help scientists write expert-level empirical software
Paper • 2509.06503 • Published • 6 -
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26
-
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
The End of Manual Decoding: Towards Truly End-to-End Language Models
Paper • 2510.26697 • Published • 114
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 60 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 31 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 61 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 16
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 42 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 27