FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning. Project Page: https://fapo-rl.github.io/
Ding
dyyyyyyyy
Recent Activity
liked a Space 21 days ago: ISEEKYAN/megatron_memory_estimator
new activity 2 months ago in dyyyyyyyy/FAPO-Critic: Add task categories, tags, paper link, and sample usage
new activity 2 months ago in dyyyyyyyy/FAPO-GenRM-4B: Improve model card: Add pipeline tag, library name, paper link, and abstract
Organizations
ScaleQuest
We introduce ScaleQuest, a scalable and novel data synthesis method. Project Page: https://scalequest.github.io/
- Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
  Paper • 2410.18693 • Published • 42
- dyyyyyyyy/ScaleQuest-Math
  Viewer • Updated • 1M • 69 • 23
- dyyyyyyyy/ScaleQuest-Code
  Viewer • Updated • 157k • 34 • 4
- dyyyyyyyy/ScaleQuest-Math-Qwen2.5
  Viewer • Updated • 622k • 23
COLDQA
SCAN
We propose Self-Denoising Monte Carlo Annotation (SCAN), an efficient Process Reward Model (PRM) data synthesis and noise-tolerant learning framework.
GNER
We introduce GNER, a Generative Named Entity Recognition framework, which demonstrates enhanced zero-shot capabilities across unseen entity domains.
FAPO
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning. Project Page: https://fapo-rl.github.io/