Jiafei Lyu
dmux
AI & ML interests
Reinforcement Learning
Recent Activity
upvoted
a paper
about 16 hours ago
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
authored
a paper
8 months ago
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
authored
a paper
over 1 year ago
SEABO: A Simple Search-Based Method for Offline Imitation Learning