hugginglearners (fastai X Hugging Face Group 2022)

vumichien

authored 2 papers 3 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29, 2025 • 8

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 36

emre

authored a paper 10 months ago

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 35

emre

posted an update 10 months ago

Post

Having trouble with AutoTrain
Hello there, this is the first time I am testing AutoTrain, with a 1.8k SFT dataset. However, I am not quite sure the training is going smoothly. The logs seem quite confusing: they say the token did not match and cannot authenticate, and the job generates confusing train splits. Do you know how I can check my running job properly?
What is being used as training data?
Any ideas?


vumichien

authored a paper 11 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 10

vumichien

authored a paper over 1 year ago

Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper • 2407.14933 • Published Jul 20, 2024 • 14

morgan

posted an update over 1 year ago

Post

Llama 3.1 405B Instruct beats GPT-4o on MixEval-Hard

Just ran MixEval for 405B, Sonnet-3.5 and 4o, with 405B landing right between the other two at 66.19

The GPT-4o result of 64.7 replicated locally but Sonnet-3.5 actually scored 70.25/69.45 in my replications 🤔 Still well ahead of the other 2 though.

Sample of one of the eval calls here: https://wandb.ai/morgan/MixEval/weave/calls/07b05ae2-2ef5-4525-98a6-c59963b76fe1

Quick auto-logging tracing for openai-compatible clients and many more here: https://wandb.github.io/weave/quickstart/
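As a quick sanity check on the ordering described above, here is a minimal sketch that ranks the three models using only the scores quoted in the post. Averaging the two Sonnet-3.5 replication runs is an assumption made for illustration; the post reports them separately.

```python
# Scores quoted in the post. Sonnet-3.5 has two replication runs.
scores = {
    "Llama 3.1 405B Instruct": [66.19],
    "GPT-4o": [64.7],
    "Sonnet-3.5": [70.25, 69.45],
}

# Average repeated runs so each model has one comparable number.
# (Assumption: the post does not say how to combine the two runs.)
mean_scores = {name: sum(runs) / len(runs) for name, runs in scores.items()}

# Rank from highest to lowest mean score.
ranking = sorted(mean_scores, key=mean_scores.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {mean_scores[name]:.2f}")
```

With these numbers, 405B sits between Sonnet-3.5 and GPT-4o, which matches the post's description.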

vumichien

authored a paper over 1 year ago

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Paper • 2406.15877 • Published Jun 22, 2024 • 48

vumichien

authored a paper almost 2 years ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 42

satpalsr

posted an update almost 2 years ago

Post

Introducing Indic Chat!

Try out the best open-source Indic LLMs now on https://www.indic.chat/

Models available:
• Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0
• GenVRadmin/AryaBhatta-GemmaOrca
• BhabhaAI/Gajendra-v0.1
• ai4bharat/Airavata

Additionally:

1. We are opening up our Discord for everyone to collaborate on & accelerate Indic LLMs: https://bhabha.ai/discord

2. We are releasing a filtered, Hindi-translated version (~600K rows) of the OpenHermes-2.5 instruction dataset: BhabhaAI/openhermes-2.5-hindi

Also, thanks to our compute sponsors, Telugu LLM Labs & Bhabha AI, for helping us serve models for Indic Chat.

If you’d like to be a sponsor too, check out
https://www.indic.chat/sponsor

taisazero

authored a paper almost 2 years ago

Can Language Models Employ the Socratic Method? Experiments with Code Debugging

Paper • 2310.03210 • Published Oct 4, 2023

morgan

posted an update almost 2 years ago

Post

Fine-tuning LLMs is rad, but how do you manage all your checkpoints and evals in a production setting?

We partnered with @hamel to ship an Enterprise Model Management course packed full of learnings for those training, evaluating and deploying models at work.

Topics include:
- What webhooks are & how to use them to create integrations with different tools
- How to automate train -> eval runs
- Improving model governance and documentation
- Comparing candidate and baseline models
- Design patterns & recipes
- Lots more...

Would love to hear what you think!

👉 https://www.wandb.courses/courses/enterprise-model-management

vumichien

authored a paper almost 2 years ago

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

Paper • 2402.06619 • Published Feb 9, 2024 • 56

satpalsr

posted an update almost 2 years ago

Post

Introducing Gajendra!

An early release of our 7B Hindi-Hinglish-English Instruction fine-tuned language model.

Model: BhabhaAI/Gajendra-v0.1

We additionally explore ways to filter examples that can be translated from English to Hindi and are releasing initial versions of both dataset and model for it.

Model: BhabhaAI/Mistral-translation-classify
Dataset: BhabhaAI/translation-classify

Looking forward to collaborating with the open-source community to accelerate and release Hindi LLMs.


morgan

posted an update almost 2 years ago

Post

Delighted to share a course I've learned a ton from about getting better outputs from LLMs

https://www.wandb.courses/courses/steering-language-models

We released it last Thursday (free). At just 30 minutes of content in total, it's very information-dense, with non-stop learnings covering important concepts around LLM validation and making your approach to LLM prompting more pythonic. It also quickly covers a basic RAG application at the end.

Would love to hear what ye think!

vumichien

authored 3 papers over 2 years ago

msivanes

updated a model almost 3 years ago

hugginglearners/ml-news-classify-fastai

Text Classification • Updated Mar 8, 2023 • 1

merve

updated a Space over 3 years ago

README

💻

AI & ML interests

Team members 214
