dyyyyyyyy committed on
Commit d525519 · verified · 1 Parent(s): 9062fda

Update README.md

Files changed (1)
  1. README.md +23 -18
README.md CHANGED
@@ -1,18 +1,23 @@
- ---
- license: apache-2.0
- ---
-
- Generative Reward Model trained with [FAPO-Critic](https://huggingface.co/datasets/dyyyyyyyy/FAPO-Critic)
-
- ---
-
- Project Homepage: https://fapo-rl.github.io/
-
- Code Implementation: https://github.com/volcengine/verl/tree/main/recipe/fapo
-
- Welcome to follow and cite our works!
-
- BibTeX citation:
- ```bibtex
- comming soon
- ```
 
 
 
 
 
 
+ ---
+ license: apache-2.0
+ ---
+
+ Generative Reward Model trained with [FAPO-Critic](https://huggingface.co/datasets/dyyyyyyyy/FAPO-Critic)
+
+ ---
+
+ Project Homepage: https://fapo-rl.github.io/
+
+ Code Implementation: https://github.com/volcengine/verl/tree/main/recipe/fapo
+
+ Welcome to follow and cite our works!
+
+ BibTeX citation:
+ ```bibtex
+ @article{ding2025fapo,
+ title={FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning},
+ author={Ding, Yuyang and Zhang, Chi and Li, Juntao and Lin, Haibin and Liu, Xin and Zhang, Min},
+ journal={arXiv preprint arXiv:2510.22543},
+ year={2025}
+ }
+ ```