When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6 • 111
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning Paper • 2508.09726 • Published Aug 13 • 14
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens Paper • 2508.05305 • Published Aug 7 • 46
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30 • 65
Skywork-Reward-V2 Collection Scaling preference data curation to the extreme • 9 items • Updated Jul 4 • 23
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published Jun 7 • 71
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published May 27 • 61
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 78
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 5 days ago • 57
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 72
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29 • 92
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 120
RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts Paper • 2504.06947 • Published Apr 9 • 3
Gemma 3 QAT INT4 (from Flax) Collection These are converted from the official QAT INT4 Flax checkpoints on Kaggle. Supported formats: AutoAWQ, GGUF • 12 items • Updated Apr 6 • 6
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Paper • 2406.09279 • Published Jun 13, 2024 • 3
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published Mar 3 • 38