nielsr HF Staff commited on
Commit
68a39b1
Β·
verified Β·
1 Parent(s): d6d9e0c

Add model card with metadata and links

Browse files

This PR adds a comprehensive model card to the repository. It includes:
- Relevant metadata: `license: apache-2.0`, `pipeline_tag: text-to-image`, and `library_name: diffusers` (evidenced by `adapter_config.json`).
- Links to the paper ([PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling](https://huggingface.co/papers/2512.04784)), project page (https://x-gengroup.github.io/HomePage_PaCo-RL/), and GitHub repository (https://github.com/X-GenGroup/PaCo-RL/).
- A detailed overview, key components, repository structure, model zoo, acknowledgements, and citation, all extracted from the official GitHub README.

Please review and merge if everything looks good.

Files changed (1) hide show
  1. README.md +119 -0
README.md ADDED
@@ -0,0 +1,119 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ pipeline_tag: text-to-image
4
+ library_name: diffusers
5
+ ---
6
+
7
+ # PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
8
+
9
+ The model presented in [PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling](https://huggingface.co/papers/2512.04784).
10
+
11
+ Project Page: https://x-gengroup.github.io/HomePage_PaCo-RL/
12
+ Code: https://github.com/X-GenGroup/PaCo-RL
13
+
14
+ <p align="center">
15
+ <b>Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling</b>
16
+ </p>
17
+
18
+ <div align="center">
19
+ <a href='https://arxiv.org/abs/2512.04784'><img src='https://img.shields.io/badge/ArXiv-red?logo=arxiv'></a> &nbsp;
20
+ <a href='https://x-gengroup.github.io/HomePage_PaCo-RL/'><img src='https://img.shields.io/badge/ProjectPage-purple?logo=github'></a> &nbsp;
21
+ <a href="https://github.com/X-GenGroup/PaCo-RL"><img src="https://img.shields.io/badge/Code-9E95B7?logo=github"></a> &nbsp;
22
+ <a href='https://huggingface.co/collections/X-GenGroup/paco-rl'><img src='https://img.shields.io/badge/Data & Model-green?logo=huggingface'></a> &nbsp;
23
+ </div>
24
+
25
+ ## 🌟 Overview
26
+
27
+ **PaCo-RL** is a comprehensive framework for consistent image generation through reinforcement learning, addressing challenges in preserving identities, styles, and logical coherence across multiple images for storytelling and character design applications.
28
+
29
+ ### Key Components
30
+
31
+ - **PaCo-Reward**: A pairwise consistency evaluator with task-aware instruction and CoT reasoning.
32
+ - **PaCo-GRPO**: Efficient RL optimization with resolution-decoupled training and log-tamed multi-reward aggregation
33
+
34
+ ## πŸš€ Quick Start
35
+
36
+ ### Installation
37
+ ```bash
38
+ git clone https://github.com/X-GenGroup/PaCo-RL.git
39
+ cd PaCo-RL
40
+ ```
41
+
42
+ ### Train Reward Model
43
+ ```bash
44
+ cd PaCo-Reward
45
+ conda create -n paco-reward python=3.12 -y
46
+ conda activate paco-reward
47
+ cd LLaMA-Factory && pip install -e ".[torch,metrics]" --no-build-isolation
48
+ cd .. && bash train/paco_reward.sh
49
+ ```
50
+
51
+ See πŸ“– [PaCo-Reward Documentation](PaCo-Reward/README.md) for detailed guide.
52
+
53
+ ### Run RL Training
54
+ ```bash
55
+ cd PaCo-GRPO
56
+ conda create -n paco-grpo python=3.12 -y
57
+ conda activate paco-grpo
58
+ pip install -e .
59
+
60
+ # Setup vLLM reward server
61
+ conda create -n vllm python=3.12 -y
62
+ conda activate vllm && pip install vllm
63
+ export CUDA_VISIBLE_DEVICES=0
64
+ export VLLM_MODEL_PATHS='X-GenGroup/PaCo-Reward-7B'
65
+ export VLLM_MODEL_NAMES='Paco-Reward-7B'
66
+ bash vllm_server/launch.sh
67
+
68
+ # Start training
69
+ export CUDA_VISIBLE_DEVICES=1,2,3,4,5,6,7
70
+ conda activate paco-grpo
71
+ bash scripts/single_node/train_flux.sh t2is
72
+ ```
73
+
74
+ See πŸ“– [PaCo-GRPO Documentation](PaCo-GRPO/README.md) for detailed guide.
75
+
76
+ ## πŸ“ Repository Structure
77
+ ```
78
+ PaCo-RL/
79
+ β”œβ”€β”€ PaCo-GRPO/ # RL training framework
80
+ β”‚ β”œβ”€β”€ config/ # RL configurations
81
+ β”‚ β”œβ”€β”€ scripts/ # Training scripts
82
+ β”‚ └── README.md
83
+ β”œβ”€β”€ PaCo-Reward/ # Reward model training
84
+ β”‚ β”œβ”€β”€ LLaMA-Factory/ # Training framework
85
+ β”‚ β”œβ”€β”€ config/ # Training configurations
86
+ β”‚ └── README.md
87
+ └── README.md
88
+ ```
89
+
90
+ ## 🎁 Model Zoo
91
+
92
+ | Model | Type | HuggingFace |
93
+ |-------|------|-------------|
94
+ | **PaCo-Reward-7B** | Reward Model | [πŸ€— Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B) |
95
+ | **PaCo-Reward-7B-Lora** | Reward Model (LoRA) | [πŸ€— Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B-Lora) |
96
+ | **PaCo-FLUX.1-dev** | T2I Model (LoRA) | [πŸ€— Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-dev-Lora) |
97
+ | **PaCo-FLUX.1-Kontext-dev** | Image Editing Model (LoRA) | [πŸ€— Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-Kontext-Lora) |
98
+ | **PaCo-QwenImage-Edit** | Image Editing Model (LoRA) | [πŸ€— Link](https://huggingface.co/X-GenGroup/PaCo-Qwen-Image-Edit-Lora) |
99
+
100
+ ## πŸ€— Acknowledgement
101
+
102
+ Our work is built upon [Flow-GRPO](https://github.com/yifan123/flow_grpo), [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory), [vLLM](https://github.com/vllm-project/vllm), and [Qwen2.5-VL](https://github.com/QwenLM/Qwen3-VL). We sincerely thank the authors for their valuable contributions to the community.
103
+
104
+ ## ⭐ Citation
105
+ ```bibtex
106
+ @misc{ping2025pacorladvancingreinforcementlearning,
107
+ title={PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling},
108
+ author={Bowen Ping and Chengyou Jia and Minnan Luo and Changliang Xia and Xin Shen and Zhuohang Dang and Hangwei Qian},
109
+ year={2025},
110
+ eprint={2512.04784},
111
+ archivePrefix={arXiv},
112
+ primaryClass={cs.CV},
113
+ url={https://arxiv.org/abs/2512.04784},
114
+ }
115
+ ```
116
+
117
+ <div align="center">
118
+ <sub>⭐ Star us on GitHub if you find PaCo-RL helpful!</sub>
119
+ </div>