mehuldamani's Collections

RLCR

updated Aug 6

Collection of models and datasets for "Beyond Binary Rewards: Training LMs to Reason about their Uncertainty"

  • mehuldamani/big-math-digits-v2-correctness

    Text Generation • 8B • Updated Jun 25 • 62

  • mehuldamani/hotpot-v2-correctness-7b

    Text Generation • 8B • Updated Jul 29 • 51

  • mehuldamani/orm-big-math-digits-v2-correctness

    Text Classification • 7B • Updated Jul 8 • 8

  • mehuldamani/big-math-digits-v2-brier

    8B • Updated Aug 4 • 64

  • mehuldamani/big-math-digits

    Viewer • Updated Aug 5 • 31k • 236

  • mehuldamani/hotpot_qa

    Viewer • Updated Aug 5 • 20.5k • 528

  • mehuldamani/hotpot-v2-brier-7b-no-split

    Text Generation • 8B • Updated Jun 5 • 97

  • mehuldamani/big-math-digits-v2-brier-base-tabc

    Text Generation • 8B • Updated Jun 28 • 36

  • mehuldamani/orm-hotpot-v2-final-correctness

    Text Classification • 7B • Updated Jun 9 • 13

  • mehuldamani/qwen-base-verifier-sft-v1

    Text Generation • 8B • Updated Jun 13 • 5