Article Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation By exploding-gradients • Sep 16 • 7
An efficient probabilistic hardware architecture for diffusion-like models Paper • 2510.23972 • Published 8 days ago • 3
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published 6 days ago • 39
Article 3+ Years of ML & Society at Hugging Face By yjernite and 3 others • 7 days ago • 13
Article huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning • 9 days ago • 53
Article Aligning to What? Rethinking Agent Generalization in MiniMax M2 By MiniMax-AI • 6 days ago • 21
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built upon gpt-oss • 2 items • Updated 7 days ago • 55
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs Paper • 2402.12030 • Published Feb 19, 2024 • 3
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 246
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published 12 days ago • 19
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 24
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math Paper • 2504.21233 • Published Apr 30 • 49
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv • 13 days ago • 115
Bridging Offline and Online Reinforcement Learning for LLMs Paper • 2506.21495 • Published Jun 26 • 3
CWM: An Open-Weights LLM for Research on Code Generation with World Models Paper • 2510.02387 • Published Sep 30 • 7
Environment Hub Collection A collection of OpenEnv-spec Environments • 5 items • Updated 13 days ago • 10