Hugging Face
whoisjones's Collections
MastermindEval
Updated Mar 7
Evaluating reasoning capabilities of LLMs using the game of Mastermind (paper is coming)
flair/mastermind_35_mcq_random • Updated Mar 12 • 37.1k • 155
flair/mastermind_46_mcq_random • Updated Mar 12 • 36.1k • 153
flair/mastermind_46_mcq_close • Updated Mar 12 • 36.1k • 153
flair/mastermind_24_mcq_random • Updated Mar 12 • 30.4k • 154
flair/mastermind_24_mcq_close • Updated Mar 12 • 30.4k • 148
flair/mastermind_35_mcq_close • Updated May 29 • 37.1k • 165