FineData

AI & ML interests

We release large pre-training datasets to accelerate open LLM development. Part of the Hugging Face Science team (hf.co/science).

HuggingFaceFW's collections (7)

📚 FineWeb-Edu
FineWeb-Edu datasets, classifier and ablation model
🧪 FineWeb v1 data experiments
Ablation models trained for our data experiments.
📀 Dataset comparison models
1.8B-parameter models trained on 350B tokens to compare different pretraining datasets