Collections
Discover the best community collections!
Collections including paper arxiv:2505.09388
-
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
Paper • 2506.08889 • Published • 23 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 92 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 262 -
OpenThoughts: Data Recipes for Reasoning Models
Paper • 2506.04178 • Published • 48
-
Skywork Open Reasoner 1 Technical Report
Paper • 2505.22312 • Published • 54 -
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Paper • 2505.21191 • Published • 3 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 187 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318
-
The Leaderboard Illusion
Paper • 2504.20879 • Published • 72 -
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Paper • 2505.09343 • Published • 72 -
LLMs for Engineering: Teaching Models to Design High Powered Rockets
Paper • 2504.19394 • Published • 14 -
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions
Paper • 2504.19056 • Published • 18
-
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper • 2506.06395 • Published • 132 -
Magistral
Paper • 2506.10910 • Published • 65 -
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
Paper • 2506.07240 • Published • 7 -
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Paper • 2506.09991 • Published • 55
-
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88 -
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper • 2505.17612 • Published • 81 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 187
-
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Paper • 2505.09568 • Published • 98 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318 -
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Paper • 2505.11049 • Published • 60 -
Emerging Properties in Unified Multimodal Pretraining
Paper • 2505.14683 • Published • 134
-
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper • 2506.06395 • Published • 132 -
Magistral
Paper • 2506.10910 • Published • 65 -
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
Paper • 2506.07240 • Published • 7 -
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Paper • 2506.09991 • Published • 55
-
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
Paper • 2506.08889 • Published • 23 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 92 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 262 -
OpenThoughts: Data Recipes for Reasoning Models
Paper • 2506.04178 • Published • 48
-
Skywork Open Reasoner 1 Technical Report
Paper • 2505.22312 • Published • 54 -
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Paper • 2505.21191 • Published • 3 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 187 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318
-
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88 -
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper • 2505.17612 • Published • 81 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 187
-
The Leaderboard Illusion
Paper • 2504.20879 • Published • 72 -
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Paper • 2505.09343 • Published • 72 -
LLMs for Engineering: Teaching Models to Design High Powered Rockets
Paper • 2504.19394 • Published • 14 -
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions
Paper • 2504.19056 • Published • 18
-
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Paper • 2505.09568 • Published • 98 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 318 -
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Paper • 2505.11049 • Published • 60 -
Emerging Properties in Unified Multimodal Pretraining
Paper • 2505.14683 • Published • 134