The list of hands-on notebooks (some beginner-friendly!) to get started with fine-tuning using TRL keeps growing!!
• SFT (minimal sketch below)
• GRPO
• Tool calling & agents
• RL environments with OpenEnv
• LLMs and VLMs
✨ Many run on FREE Colab, making it super easy to get started fast!
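For a taste of the simplest case, here's a minimal SFT sketch in the spirit of the TRL quickstart; the model and dataset ids below are common demo choices, not tied to any particular notebook:

```python
# Minimal SFT sketch (assumes trl and datasets are installed);
# the model and dataset ids are demo placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # any small causal LM works here
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-demo"),
)
trainer.train()
```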
The Christmas holidays are here! 🎄 Thinking about learning something new in AI?
@huggingface offers 12 FREE courses covering all the relevant topics for every level of experience. A great challenge for the holidays (and worth saving for later 🙄)
Following up on LLaDA 2.0, the paper is now out on Daily Papers 🔥 It has sparked a lot of discussion in the community for showing how discrete diffusion LLMs can scale to 100B and run faster than traditional autoregressive (AR) models.
LLaDA2.0: Scaling Up Diffusion Language Models to 100B (2512.15745)
✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert annotated (700+ hours) and genuinely challenging
ICYMI, you can fine-tune open LLMs using Claude Code
just tell it: “Fine-tune Qwen3-0.6B on open-r1/codeforces-cots”
and Claude submits a real training job on HF GPUs using TRL.
it handles everything:
> dataset validation
> GPU selection
> training + Trackio monitoring
> job submission + cost estimation
when it’s done, your model is on the Hub, ready to use
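And once the job lands the model on the Hub, picking it back up is plain transformers usage — a quick sketch below, with a hypothetical repo id standing in for wherever your job pushed it:

```python
# Loading the fine-tuned model back from the Hub; the repo id below is
# hypothetical -- substitute whatever your job actually pushed to.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-username/Qwen3-0.6B-codeforces-cots"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tokenizer("Solve this problem:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```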
installama.sh at the TigerBeetle 1000x World Tour!
Last week I had the chance to give a short talk during the TigerBeetle 1000x World Tour (organized by @jedisct1 👏), a fantastic event celebrating high-performance engineering and the people who love pushing systems to their limits!
In the talk, I focused on the CPU and Linux side of things, with a simple goal in mind: making the installation of llama.cpp instant, automatic, and optimal, no matter your OS or hardware setup.
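As a toy illustration of the "optimal" part (not the actual installama.sh logic), here's the kind of CPU-feature probe an installer can do on Linux to pick the right llama.cpp build:

```python
# Toy sketch: read /proc/cpuinfo flags (Linux-only) to choose a build
# variant; an illustrative assumption, not installama.sh itself.
def cpu_flags() -> set[str]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
if "avx512f" in flags:
    variant = "avx512"
elif "avx2" in flags:
    variant = "avx2"
else:
    variant = "generic"
print(f"would pick the {variant} llama.cpp build")
```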
It comes packed with updates:
> Agent training with tools in GRPO
> New CISPO & SAPO losses + reasoning rewards (see the sketch below)
> vLLM quantization in colocate mode
> Dataset shuffling in SFT
> Lots of NEW examples
> Tons of fixes and documentation improvements
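To make the GRPO side concrete, here's a minimal sketch along the lines of the TRL quickstart; the toy reward function is illustrative, and selecting CISPO via loss_type is my assumption based on how earlier loss variants are exposed in GRPOConfig:

```python
# Minimal GRPO sketch with a toy reward; loss_type="cispo" is an assumed
# way to opt into the new loss, mirroring earlier loss_type variants.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    # Toy reward: prefer completions close to 200 characters.
    return [-abs(200 - len(c)) for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    reward_funcs=reward_len,
    train_dataset=dataset,
    args=GRPOConfig(output_dir="grpo-demo", loss_type="cispo"),  # assumed flag
)
trainer.train()
```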