hongseokcho
dv1999
AI & ML interests
None yet
Organizations
None yet
RL/DPO
- A Survey of Direct Preference Optimization
  Paper • 2503.11701 • Published
- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29
- A Technical Survey of Reinforcement Learning Techniques for Large Language Models
  Paper • 2507.04136 • Published
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 188
models
0
None public yet
datasets
0
None public yet