huangjy-pku
AI & ML interests
General-purpose Vision, Multi-modal Learning, Scene Understanding, Embodied Agent
view article
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment
upvoted a paper over 1 year ago