voxnemo's Collections
Datasets
Updated Jul 9
common-pile/arxiv_abstracts_filtered • Viewer • Updated 4 days ago • 2.5M • 124 • 5
common-pile/youtube_filtered • Viewer • Updated Jun 6 • 986k • 176 • 4
common-pile/wikiteam_filtered • Viewer • Updated Jun 6 • 10.2M • 485
common-pile/wikimedia_filtered • Viewer • Updated Jun 6 • 12.9M • 250 • 5
common-pile/uspto_filtered • Viewer • Updated Jun 6 • 14.4M • 517 • 3
common-pile/usgpo_filtered • Viewer • Updated Jun 6 • 2.34M • 166 • 1
common-pile/uk_hansard_filtered • Viewer • Updated Jun 6 • 47.9k • 71 • 1
common-pile/ubuntu_irc_filtered • Viewer • Updated Jun 6 • 216k • 61 • 1
common-pile/stackv2_html_filtered • Viewer • Updated May 23 • 1.67M • 187 • 2
common-pile/stackv2_edu_filtered • Viewer • Updated Jun 6 • 57M • 922 • 5
common-pile/stackexchange_filtered • Viewer • Updated Jun 6 • 27.5M • 488 • 6
common-pile/regulations_filtered • Viewer • Updated Jun 6 • 192k • 48
common-pile/python_enhancement_proposals_filtered • Viewer • Updated Jun 6 • 655 • 55 • 1
common-pile/pubmed_filtered • Viewer • Updated Jun 6 • 4.77M • 264 • 2
common-pile/public_domain_review_filtered • Viewer • Updated Jun 6 • 1.41k • 25
common-pile/project_gutenberg_filtered • Viewer • Updated Jun 6 • 57.1k • 374
common-pile/pressbooks_filtered • Viewer • Updated Jun 6 • 54.5k • 63
common-pile/pre_1929_books_filtered • Viewer • Updated Jun 6 • 122k • 143
common-pile/peS2o_filtered • Viewer • Updated Jun 6 • 6.09M • 423 • 1
common-pile/oercommons_filtered • Viewer • Updated Jun 6 • 5.25k • 57 • 1
common-pile/news_filtered • Viewer • Updated Jun 6 • 127k • 37 • 1
common-pile/libretexts_filtered • Viewer • Updated Jun 6 • 40k • 75 • 1
common-pile/library_of_congress_filtered • Viewer • Updated Jun 6 • 128k • 267 • 2
common-pile/github_archive_filtered • Viewer • Updated Jun 6 • 23.3M • 208 • 1
common-pile/foodista_filtered • Preview • Updated Jun 6 • 36 • 1
common-pile/doab_filtered • Viewer • Updated Jun 6 • 404k • 76 • 1
common-pile/data_provenance_initiative_filtered • Viewer • Updated Jun 6 • 3.51M • 60
common-pile/cccc_filtered • Viewer • Updated Jun 6 • 10.8M • 328 • 1
common-pile/caselaw_access_project_filtered • Viewer • Updated Jun 6 • 5.5M • 652 • 6
common-pile/biodiversity_heritage_library_filtered • Viewer • Updated Jun 6 • 16.5M • 139 • 1
common-pile/arxiv_papers_filtered • Viewer • Updated Jun 6 • 309k • 589 • 8
togethercomputer/RedPajama-Data-V2 • Updated Nov 21, 2024 • 5.88k • 383
allenai/llama-3.1-tulu-3-405b-preference-mixture • Viewer • Updated Feb 5 • 361k • 250 • 6
HuggingFaceFW/fineweb-edu • Viewer • Updated Jul 11 • 3.5B • 219k • 815
nvidia/Llama-Nemotron-Post-Training-Dataset • Viewer • Updated May 8 • 3.91M • 6.92k • 603
open-thoughts/OpenThoughts3-1.2M • Viewer • Updated Jun 9 • 1.2M • 19.6k • 181