FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning. Project Page: https://fapo-rl.github.io/
Ding
dyyyyyyyy
Recent Activity
New activity (13 days ago): dyyyyyyyy/FAPO-Critic: Add task categories, tags, paper link, and sample usage
New activity (13 days ago): dyyyyyyyy/FAPO-GenRM-4B: Improve model card: Add pipeline tag, library name, paper link, and abstract
Authored a paper (14 days ago): FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
Organizations
ScaleQuest
We introduce ScaleQuest, a scalable and novel data synthesis method. Project Page: https://scalequest.github.io/
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
Paper • 2410.18693 • Published • 42
dyyyyyyyy/ScaleQuest-Math
Viewer • Updated • 1M • 65 • 23
dyyyyyyyy/ScaleQuest-Code
Viewer • Updated • 157k • 30 • 4
dyyyyyyyy/ScaleQuest-Math-Qwen2.5
Viewer • Updated • 622k • 35
COLDQA
SCAN
We propose Self-Denoising Monte Carlo Annotation (SCAN), an efficient Process Reward Model (PRM) data synthesis and noise-tolerant learning framework.
GNER
We introduce GNER, a Generative Named Entity Recognition framework, which demonstrates enhanced zero-shot capabilities across unseen entity domains.
Rethinking Negative Instances for Generative Named Entity Recognition
Paper • 2402.16602 • Published • 3
dyyyyyyyy/GNER-LLaMA-7B
Text Generation • 7B • Updated • 2 • 5
dyyyyyyyy/GNER-T5-base
Text Generation • 0.2B • Updated • 5 • 2
dyyyyyyyy/GNER-T5-large
Text Generation • 0.8B • Updated • 3 • 2
FAPO
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning. Project Page: https://fapo-rl.github.io/