ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published 16 days ago • 78
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters Space • Running • 3.47k
Spotlight on Token Perception for Multimodal Reinforcement Learning Paper • 2510.09285 • Published Oct 10 • 36
Diversity-Incentivized Exploration for Versatile Reasoning Paper • 2509.26209 • Published Sep 30 • 16
Native Hybrid Attention for Efficient Sequence Modeling Paper • 2510.07019 • Published Oct 8 • 16
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation Paper • 2509.14760 • Published Sep 18 • 52