Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • Paper arXiv:2511.08577 • Published 9 days ago • 85 upvotes
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B • Paper arXiv:2511.06221 • Published 11 days ago • 109 upvotes
Pre-training Dataset Samples • Collection of pre-training dataset samples at 10M, 100M, and 1B tokens, ideal for quick experimentation and ablations • 19 items • Updated 9 days ago • 13 upvotes
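As a minimal sketch of the "quick experimentation" use case, the snippet below streams a few examples from one of the sample datasets with the standard datasets library; the repo id and the "text" column name are hypothetical placeholders, not confirmed identifiers from the collection.

```python
from itertools import islice

from datasets import load_dataset

# Stream a small pre-training sample without downloading the full dataset.
# "org/pretraining-sample-10M" is a placeholder repo id -- substitute an
# actual dataset from the collection.
ds = load_dataset("org/pretraining-sample-10M", split="train", streaming=True)

# Peek at the first three examples; assumes a "text" column, which is
# typical for pre-training corpora but not verified here.
for example in islice(ds, 3):
    print(example["text"][:80])
```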
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper arXiv:2510.11696 • Published Oct 13 • 173 upvotes
Kimi Linear: An Expressive, Efficient Attention Architecture • Paper arXiv:2510.26692 • Published 21 days ago • 107 upvotes
The End of Manual Decoding: Towards Truly End-to-End Language Models • Paper arXiv:2510.26697 • Published 21 days ago • 113 upvotes
Energy-Based Transformers are Scalable Learners and Thinkers • Paper arXiv:2507.02092 • Published Jul 2 • 69 upvotes
Transition Models: Rethinking the Generative Learning Objective • Paper arXiv:2509.04394 • Published Sep 4 • 28 upvotes
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs • Paper arXiv:2508.16153 • Published Aug 22 • 154 upvotes
CRISP: Persistent Concept Unlearning via Sparse Autoencoders • Paper arXiv:2508.13650 • Published Aug 19 • 15 upvotes
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper arXiv:2508.05629 • Published Aug 7 • 178 upvotes