---
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
---

# PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

This model is presented in [PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling](https://huggingface.co/papers/2512.04784).

## 🌟 Overview

**PaCo-RL** is a comprehensive framework for consistent image generation through reinforcement learning. It addresses the challenges of preserving identities, styles, and logical coherence across multiple images, for applications such as storytelling and character design.

### Key Components

- **PaCo-Reward**: A pairwise consistency evaluator with task-aware instruction and CoT reasoning.
- **PaCo-GRPO**: Efficient RL optimization with resolution-decoupled training and log-tamed multi-reward aggregation.

## Example Usage

```python
import torch
from diffusers import FluxPipeline
from peft import PeftModel

# Load the FLUX.1-dev base pipeline in bfloat16 on the GPU.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    device_map="cuda"
)

# Wrap the transformer with the PaCo LoRA adapter.
pipe.transformer = PeftModel.from_pretrained(
    pipe.transformer,
    'X-GenGroup/PaCo-FLUX.1-dev-Lora'
)

# The main prompt describes the shared layout and style of all panels.
main_prompt = "THREE-PANEL Images with a 1x3 grid layout Joker-themed posters inspired by Joaquin Phoenix's portrayal, unified through minimalist aesthetics. All posters use a minimalist style with bold outlines, textured muted green backgrounds, grunge effects, distressed yellow/red/blue/green accents, and include the header 'JOAQUIN PHOENIX' in small white capitals."

# One sub-prompt per panel, tagged with its position in the grid.
sub_prompts = [
    "[LEFT]: A poster dominated by oversized, distressed yellow 'JOKER' text spanning the upper half. The letters have jagged edges and subtle cracks, contrasting sharply against the muted green grunge background. Minimal supporting elements ensure the title commands full visual attention.",
    "[MIDDLE]: A poster symmetrically framed by 'OCTOBER 4' on the left and 'PUT ON A HAPPY FACE' on the right in crisp white text. Both phrases are aligned vertically with balanced spacing, flanking a central void filled only with faint grunge textures. Red and blue accents subtly underline the text blocks.",
    "[RIGHT]: A poster centered on a stylized profile of the Joker's face with an exaggerated, sharp-edged smile. White base makeup contrasts with vivid red lips and blue triangular eye accents. His dark green hair merges with the background, while a red suit collar and yellow vest peek from below, rendered in flat minimalist shapes."
]

prompt = main_prompt + " " + " ".join(sub_prompts)

# Generate a single 1536x512 image holding the three panels side by side.
image = pipe(
    prompt,
    height=512,
    width=1536,
    guidance_scale=3.5,
    num_inference_steps=20,
    max_sequence_length=512,
    generator=torch.Generator("cuda").manual_seed(42)
).images[0]

image.save("joker_posters.png")
```

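The example above renders all three panels into one 1536x512 image. If the panels are needed as separate images, a minimal sketch for computing the crop boxes might look like the following. It assumes the panels are equally wide with no gutter between them, which this card does not state, and `panel_boxes` is a hypothetical helper name, not part of PaCo-RL or diffusers.

```python
def panel_boxes(width, height, columns=3):
    """Return (left, upper, right, lower) crop boxes for a 1 x `columns` grid.

    Assumption: panels are equally wide and share no border or gutter.
    """
    if columns < 1 or width % columns != 0:
        raise ValueError("width must divide evenly into the number of columns")
    panel_width = width // columns
    return [
        (i * panel_width, 0, (i + 1) * panel_width, height)
        for i in range(columns)
    ]


# Boxes for the 1536x512 output of the example above.
boxes = panel_boxes(1536, 512)
print(boxes)

# With the PIL image returned by the pipeline, each panel could be cut out as:
# panels = [image.crop(box) for box in boxes]
```

The boxes follow the `(left, upper, right, lower)` convention used by Pillow's `Image.crop`, so they can be passed to it directly.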
## 🎁 Model Zoo

| Model | Type | HuggingFace |
|-------|------|-------------|
| **PaCo-Reward-7B** | Reward Model | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B) |
| **PaCo-Reward-7B-Lora** | Reward Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Reward-7B-Lora) |
| **PaCo-FLUX.1-dev** | T2I Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-dev-Lora) |
| **PaCo-FLUX.1-Kontext-dev** | Image Editing Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-FLUX.1-Kontext-Lora) |
| **PaCo-QwenImage-Edit** | Image Editing Model (LoRA) | [🤗 Link](https://huggingface.co/X-GenGroup/PaCo-Qwen-Image-Edit-Lora) |

## ⭐ Citation

```bibtex
@misc{ping2025pacorladvancingreinforcementlearning,
  title={PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling},
  author={Bowen Ping and Chengyou Jia and Minnan Luo and Changliang Xia and Xin Shen and Zhuohang Dang and Hangwei Qian},
  year={2025},
  eprint={2512.04784},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.04784},
}
```

⭐ Star us on GitHub if you find PaCo-RL helpful!