Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani PRO
mehuldamani
AI & ML interests: Reinforcement Learning, Large Language Models
Recent Activity
published a model about 1 hour ago: mehuldamani/qwen3_8b_hotpot_rlvr_single
published a model about 1 hour ago: mehuldamani/qwen3_8b_hotpot_rlcr_single
updated a dataset about 2 hours ago: mehuldamani/judge-v1-step600