Checkpoints for "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" (arXiv:2509.22601)
Yulei Qin
yolay
AI & ML interests
Medical Imaging, Computer Vision, Language Models
Recent Activity
authored a paper 42 minutes ago
LTD-Bench: Evaluating Large Language Models by Letting Them Draw
upvoted a paper about 1 hour ago
Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents
upvoted a paper about 1 hour ago
Agent Lightning: Train ANY AI Agents with Reinforcement Learning