HuggingFaceFW's Collections
🌐 FineWiki
📄 FinePDFs
🥂 FineWeb2
🍷 FineWeb
📚 FineWeb-Edu
📀 Dataset comparison models
🧪 FineWeb v1 data experiments

🥂 FineWeb2

updated Jun 27
Upvote
20

  • FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

    Paper • 2506.20920 • Published Jun 26 • 74

  • HuggingFaceFW/fineweb-2

    Viewer • Updated 7 days ago • 4.48B • 96.5k • 679

  • Running
    74

    Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

    Evaluate multilingual models using FineTasks
