ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30, 2025 • 139
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision Paper • 2403.09472 • Published Mar 14, 2024 • 1
Forward-Backward Reasoning in Large Language Models for Mathematical Verification Paper • 2308.07758 • Published Aug 15, 2023 • 4
DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality Paper • 2303.14585 • Published Mar 25, 2023