anujga's Collections
PT
Updated Jun 24
allenai/peS2o • Updated Oct 13, 2024 • 2.27k • 184
allenai/dolmino-mix-1124 • Viewer • Updated Oct 29 • 170M • 25.3k • 87
allenai/olmo-mix-1124 • Viewer • Updated Aug 19 • 621M • 20.6k • 82
Locutusque/UltraTextbooks • Viewer • Updated Feb 2, 2024 • 5.52M • 7.07k • 196
PrimeIntellect/StackV1-popular • Viewer • Updated Oct 8, 2024 • 93M • 6.61k • 2
EleutherAI/reasoning-mix • Viewer • Updated Jan 24 • 11.7M • 194 • 5
EleutherAI/the_pile_deduplicated • Viewer • Updated Dec 2, 2022 • 134M • 14.6k • 105
HIT-TMG/KaLM-embedding-pretrain-data • Viewer • Updated 4 days ago • 23.7M • 1.53k • 15
suriyagunasekar/stackoverflow-with-meta-data • Viewer • Updated Feb 23, 2023 • 19.9M • 1.22k • 12
vesteinn/babylm • Viewer • Updated Jul 3, 2023 • 13.6M • 156 • 5
Salesforce/wikitext • Viewer • Updated Jan 4, 2024 • 3.71M • 965k • 525
gk4u/reddit_dataset_104 • Viewer • Updated Apr 7 • 474M • 73 • 3
EleutherAI/deep-ignorance-annealing-mix • Viewer • Updated Aug 12 • 89M • 377 • 1
Locutusque/TM-DATA-V2 • Viewer • Updated May 4, 2024 • 10.2M • 32 • 5
Skywork/SkyPile-150B • Viewer • Updated Dec 7, 2023 • 1.76M • 6.1k • 392
HuggingFaceTB/stack-edu • Viewer • Updated Mar 20 • 167M • 2.43k • 57
Locutusque/deeplm-training-data • Viewer • Updated Apr 11 • 2.17M • 124 • 3
nvidia/Llama-Nemotron-Post-Training-Dataset • Viewer • Updated May 8 • 3.91M • 6.75k • 608
LLM360/TxT360 • Updated May 26 • 28k • 241
EssentialAI/essential-web-v1.0 • Preview • Updated Oct 2 • 21.3k • 207