A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 187
barc0/200k_HEAVY_gpt4o-description-gpt4omini-code_generated_problems Viewer • Updated Nov 2, 2024 • 139k • 292 • 11
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 73
You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published Feb 13 • 37
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published Dec 17, 2024 • 35
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 90