Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18, 2025 • 88
Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 Aug 5, 2025 • 507
Article Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ +1 Jul 25, 2025 • 83
Article Asynchronous Robot Inference: Decoupling Action Prediction and Execution +6 Jul 10, 2025 • 48
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 757
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Jul 1, 2025 • 132
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 272
Article 💥 Building a Vulnerable Bank MCP — Then Automating an Agent to Hack It Jun 18, 2025 • 8
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 147
Changelog Xet is now the default storage option for new users and organizations May 23, 2025 • 74
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16, 2025 • 59
Article Reduce, Reuse, Recycle: Why Open Source is a Win for Sustainability May 7, 2025 • 17
Article Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models. May 15, 2025 • 36
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference Jan 16, 2025 • 76