-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 75
-
HuggingFaceFW/fineweb-2
Viewer • Updated • 4.48B • 90.6k • 700
-
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks
82 • Evaluate multilingual models using FineTasks
Collections
Collections including paper arxiv:2506.20920
-
NExT-GPT: Any-to-Any Multimodal LLM
Paper • 2309.05519 • Published • 78
-
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper • 2309.03883 • Published • 35
-
apple/DCLM-7B
7B • Updated • 234 • 832
-
Aria: An Open Multimodal Native Mixture-of-Experts Model
Paper • 2410.05993 • Published • 111
-
ahmedheakl/resume-atlas
Viewer • Updated • 13.4k • 213 • 10
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 75
-
Infinite Dataset Hub
♾ 279 • Search and save datasets generated with a LLM in real time
-
IntrEx: A Dataset for Modeling Engagement in Educational Conversations
Paper • 2509.06652 • Published • 24