mehuldamani's Collections
RLCR

updated Aug 6

Collection of models and datasets for the paper "Beyond Binary Rewards: Training LMs to Reason about their Uncertainty"

  • mehuldamani/big-math-digits-v2-correctness

    Text Generation • 8B • Updated Jun 25 • 75

  • mehuldamani/hotpot-v2-correctness-7b

    Text Generation • 8B • Updated Jul 29 • 112

  • mehuldamani/orm-big-math-digits-v2-correctness

    Text Classification • 7B • Updated Jul 8 • 3

  • mehuldamani/big-math-digits-v2-brier

    8B • Updated Aug 4 • 17

  • mehuldamani/big-math-digits

    Viewer • Updated Aug 5 • 31k • 186

  • mehuldamani/hotpot_qa

    Viewer • Updated Aug 5 • 20.5k • 539

  • mehuldamani/hotpot-v2-brier-7b-no-split

    Text Generation • 8B • Updated Jun 5 • 107

  • mehuldamani/big-math-digits-v2-brier-base-tabc

    Text Generation • 8B • Updated Jun 28 • 58

  • mehuldamani/orm-hotpot-v2-final-correctness

    Text Classification • 7B • Updated Jun 9 • 11

  • mehuldamani/qwen-base-verifier-sft-v1

    Text Generation • 8B • Updated Jun 13 • 3