X-GenGroup/PaCo-FLUX.1-Kontext-Lora

Text-to-Image

Diffusers

Safetensors

Jayce-Ping committed 8 days ago

Commit

2c89669

verified ·

1 Parent(s): 1bf9d9d

Update README.md

Files changed (1)

README.md +48 -52

@@ -1,77 +1,69 @@
 ---
 license: apache-2.0
-pipeline_tag: image-to-image
 library_name: diffusers
 ---
 # PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
-This repository contains the official implementation of **PaCo-RL**, a comprehensive framework for consistent image generation.
-[\ud83d\udcda Paper](https://huggingface.co/papers/2512.04784) | [\ud83c\udf10 Project Page](https://x-gengroup.github.io/HomePage_PaCo-RL/) | [\ud83d\udcbb Code](https://github.com/X-GenGroup/PaCo-RL) | [\ud83e\udd17 Models & Data](https://huggingface.co/collections/X-GenGroup/paco-rl)
-PaCo-RL aims to preserve identities, styles, and logical coherence across multiple images, which is essential for applications such as storytelling and character design. It leverages reinforcement learning to learn complex and subjective visual criteria without large-scale datasets, by combining a specialized consistency reward model (PaCo-Reward) with an efficient RL algorithm (PaCo-GRPO).
-## Key Components
-- **PaCo-Reward**: A pairwise consistency evaluator trained on a large-scale dataset constructed via automated sub-figure pairing. It evaluates consistency through a generative, autoregressive scoring mechanism enhanced by task-aware instructions and CoT reasons.
-- **PaCo-GRPO**: An efficient RL optimization strategy that leverages a novel resolution-decoupled optimization to substantially reduce RL cost, alongside a log-tamed multi-reward aggregation mechanism that ensures balanced and stable reward optimization.
-## \ud83d\ude80 Quick Start
-For detailed instructions on installation, training the reward model, and running RL training, please refer to the [GitHub repository](https://github.com/X-GenGroup/PaCo-RL).
-### Installation
-```bash
-git clone https://github.com/X-GenGroup/PaCo-RL.git
-cd PaCo-RL
-```
-### Train Reward Model
-```bash
-cd PaCo-Reward
-conda create -n paco-reward python=3.12 -y
-conda activate paco-reward
-cd LLaMA-Factory && pip install -e ".[torch,metrics]" --no-build-isolation
-cd .. && bash train/paco_reward.sh
-```
-### Run RL Training
-```bash
-cd PaCo-GRPO
-conda create -n paco-grpo python=3.12 -y
-conda activate paco-grpo
-pip install -e .
-# Setup vLLM reward server
-conda create -n vllm python=3.12 -y
-conda activate vllm && pip install vllm
-export CUDA_VISIBLE_DEVICES=0
-export VLLM_MODEL_PATHS='X-GenGroup/PaCo-Reward-7B'
-export VLLM_MODEL_NAMES='Paco-Reward-7B'
-bash vllm_server/launch.sh
-# Start training
-export CUDA_VISIBLE_DEVICES=1,2,3,4,5,6,7
-conda activate paco-grpo
-bash scripts/single_node/train_flux.sh t2is
 ```
-## \ud83c\udf81 Model Zoo
 | Model | Type | HuggingFace |
 |-------|------|-------------|
-| **PaCo-Reward-7B** | Reward Model | [\ud83e\udd17 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B) |
-| **PaCo-Reward-7B-Lora** | Reward Model (LoRA) | [\ud83e\udd17 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B-Lora) |
-| **PaCo-FLUX.1-dev** | T2I Model (LoRA) | [\ud83e\udd17 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-dev-Lora) |
-| **PaCo-FLUX.1-Kontext-dev** | Image Editing Model (LoRA) | [\ud83e\udd17 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-Kontext-Lora) |
-| **PaCo-QwenImage-Edit** | Image Editing Model (LoRA) | [\ud83e\udd17 Link](https://huggingface.co/X-GenGroup/PaCo-Qwen-Image-Edit-Lora) |
-## Acknowledgement
-Our work is built upon [Flow-GRPO](https://github.com/yifan123/flow_grpo), [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory), [vLLM](https://github.com/vllm-project/vllm), and [Qwen2.5-VL](https://github.com/QwenLM/Qwen3-VL). We sincerely thank the authors for their valuable contributions to the community.
-## Citation
-If you find our work helpful or inspiring, please feel free to cite it:
 ```bibtex
 @misc{ping2025pacorladvancingreinforcementlearning,
       title={PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling},
@@ -82,4 +74,8 @@ If you find our work helpful or inspiring, please feel free to cite it:
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2512.04784},
 }
-```

 ---
 license: apache-2.0
+pipeline_tag: text-to-image
 library_name: diffusers
 ---
 # PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
+<div align="center">
+  <a href='https://arxiv.org/abs/2512.04784'><img src='https://img.shields.io/badge/ArXiv-red?logo=arxiv'></a>  &nbsp;
+  <a href='https://x-gengroup.github.io/HomePage_PaCo-RL/'><img src='https://img.shields.io/badge/ProjectPage-purple?logo=github'></a> &nbsp;
+  <a href="https://github.com/X-GenGroup/PaCo-RL"><img src="https://img.shields.io/badge/Code-9E95B7?logo=github"></a> &nbsp;
+  <a href='https://huggingface.co/collections/X-GenGroup/paco-rl'><img src='https://img.shields.io/badge/Data & Model-green?logo=huggingface'></a> &nbsp;
+</div>
+This model was presented in [PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling](https://huggingface.co/papers/2512.04784).
+## 🌟 Overview
+**PaCo-RL** is a comprehensive framework for consistent image generation through reinforcement learning, addressing challenges in preserving identities, styles, and logical coherence across multiple images for storytelling and character design applications.
+### Key Components
+- **PaCo-Reward**: A pairwise consistency evaluator with task-aware instruction and CoT reasoning.
+- **PaCo-GRPO**: Efficient RL optimization with resolution-decoupled training and log-tamed multi-reward aggregation.
+## Example Usage
+```python
+import torch
+from diffusers import FluxKontextPipeline
+from peft import PeftModel
+from diffusers.utils import load_image
+pipe = FluxKontextPipeline.from_pretrained(
+    "black-forest-labs/FLUX.1-Kontext-dev",
+    torch_dtype=torch.bfloat16,
+    device_map="cuda"
+)
+pipe.transformer = PeftModel.from_pretrained(
+    pipe.transformer,
+    'X-GenGroup/PaCo-FLUX.1-Kontext-Lora'
+)
+input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")
+image = pipe(
+  image=input_image,
+  prompt="Add a blue hat to the cat",
+  guidance_scale=2.5
+).images[0]
 ```
+## 🎁 Model Zoo
 | Model | Type | HuggingFace |
 |-------|------|-------------|
+| **PaCo-Reward-7B** | Reward Model | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B) |
+| **PaCo-Reward-7B-Lora** | Reward Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B-Lora) |
+| **PaCo-FLUX.1-dev** | T2I Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-dev-Lora) |
+| **PaCo-FLUX.1-Kontext-dev** | Image Editing Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-Kontext-Lora) |
+| **PaCo-QwenImage-Edit** | Image Editing Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Qwen-Image-Edit-Lora) |
+## ⭐ Citation
 ```bibtex
 @misc{ping2025pacorladvancingreinforcementlearning,
       title={PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling},
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2512.04784},
 }
+```
+<div align="center">
+  <sub>⭐ Star us on GitHub if you find PaCo-RL helpful!</sub>
+</div>
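
The Key Components list in the updated README mentions "log-tamed multi-reward aggregation" without giving a formula. The sketch below is not PaCo-GRPO's actual rule; it only illustrates the general idea of compressing each reward with a logarithm before a weighted sum, so that one large-scale reward cannot dominate the others. The function name `log_tamed_sum`, the use of `log1p`, the reward names, and the default weight of 1.0 are all assumptions made for illustration.

```python
import math
from typing import Mapping


def log_tamed_sum(rewards: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of log-compressed rewards (illustrative, not the paper's formula).

    Each reward must be non-negative so that log1p(value) is well defined
    and monotonically increasing in the reward.
    """
    total = 0.0
    for name, value in rewards.items():
        if value < 0:
            raise ValueError(f"reward {name!r} must be non-negative, got {value}")
        # Hypothetical default: a reward without an explicit weight counts once.
        total += weights.get(name, 1.0) * math.log1p(value)
    return total


# Two hypothetical rewards on very different scales.
scores = {"consistency": 100.0, "aesthetics": 1.0}
raw_ratio = scores["consistency"] / scores["aesthetics"]
tamed_ratio = math.log1p(scores["consistency"]) / math.log1p(scores["aesthetics"])
print(f"raw ratio: {raw_ratio:.2f}, log-compressed ratio: {tamed_ratio:.2f}")
print(f"aggregated reward: {log_tamed_sum(scores, {}):.4f}")
```

After compression, the gap between the two rewards shrinks considerably, which is the balancing effect the bullet alludes to. For the exact aggregation used in training, refer to the paper and the PaCo-GRPO code.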