RLCR - a mehuldamani Collection

updated Aug 6

Collection of models and datasets for the paper "Beyond Binary Rewards: Training LMs to Reason about their Uncertainty"