ZHOU
TOBI-X
0 followers · 1 following
AI & ML interests
None yet
Recent Activity
upvoted a paper 27 days ago: Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting
upvoted a paper 27 days ago: Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
upvoted a paper 27 days ago: Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Organizations
None yet
models (2)
TOBI-X/XGLM-finetune-real • Text Generation • Updated Apr 17, 2023 • 1
TOBI-X/XGLM-partial-pair • Text Generation • Updated Apr 17, 2023 • 1
datasets (0)
None public yet