Richard Pang

yzpang

https://yzpang.me

AI & ML interests

NLP, ML

Organizations

None yet

upvoted a paper 3 days ago

Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

Paper • 2511.10507 • Published 4 days ago • 5

authored a paper 28 days ago

Prompt Curriculum Learning for Efficient LLM Post-Training

Paper • 2510.01135 • Published Oct 1 • 2

upvoted a paper 29 days ago

Prompt Curriculum Learning for Efficient LLM Post-Training

Paper • 2510.01135 • Published Oct 1 • 2

commented on a paper about 1 year ago

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30, 2024 • 49

authored a paper over 1 year ago

Self-Taught Evaluators

Paper • 2408.02666 • Published Aug 5, 2024 • 30

upvoted a paper over 1 year ago

Self-Taught Evaluators

Paper • 2408.02666 • Published Aug 5, 2024 • 30

authored 2 papers over 1 year ago

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30, 2024 • 49

upvoted a paper over 1 year ago

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30, 2024 • 49

upvoted 2 papers almost 2 years ago

System-Level Natural Language Feedback

Paper • 2306.13588 • Published Jun 23, 2023 • 10

Debate Helps Supervise Unreliable Experts

Paper • 2311.08702 • Published Nov 15, 2023 • 1

authored a paper almost 2 years ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151

upvoted a paper almost 2 years ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151

authored 5 papers almost 2 years ago

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Paper • 2311.12022 • Published Nov 20, 2023 • 33

Extrapolative Controlled Sequence Generation via Iterative Refinement

Paper • 2303.04562 • Published Mar 8, 2023 • 1

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

Paper • 2305.15269 • Published May 24, 2023 • 1

SQuALITY: Building a Long-Document Summarization Dataset the Hard Way

Paper • 2205.11465 • Published May 23, 2022 • 1

QuALITY: Question Answering with Long Input Texts, Yes!

Paper • 2112.08608 • Published Dec 16, 2021 • 3

upvoted 2 papers almost 2 years ago

Reward Gaming in Conditional Text Generation

Paper • 2211.08714 • Published Nov 16, 2022 • 1

Extrapolative Controlled Sequence Generation via Iterative Refinement

Paper • 2303.04562 • Published Mar 8, 2023 • 1
