X-GenGroup/PaCo-Reward-7B-Lora

@@ -1,6 +1,7 @@
 ---
 pipeline_tag: image-text-to-text
 library_name: transformers
 ---
 # PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
@@ -25,13 +26,100 @@ PaCo-RL argues that reinforcement learning offers a promising alternative for le
 Extensive experiments show that PaCo-Reward significantly improves alignment with human perceptions of visual consistency, and PaCo-GRPO achieves state-of-the-art consistency performance with improved training efficiency and stability.
 <div align="center">
-  <img src="https://github.com/X-GenGroup/PaCo-RL/raw/main/assets/readme_overview.png" alt="PaCo-RL Overview" width="800"/>
 </div>
-## Quick Start
 For detailed installation, training of the reward model (PaCo-Reward), and running the full RL training (PaCo-GRPO), please refer to the [official GitHub repository](https://github.com/X-GenGroup/PaCo-RL). The repository provides comprehensive documentation for each component.
 ## Model Zoo
 The PaCo-RL framework includes several models available on Hugging Face:

 ---
 pipeline_tag: image-text-to-text
 library_name: transformers
+license: apache-2.0
 ---
 # PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
 Extensive experiments show that PaCo-Reward significantly improves alignment with human perceptions of visual consistency, and PaCo-GRPO achieves state-of-the-art consistency performance with improved training efficiency and stability.
 <div align="center">
+  <img src="https://github.com/X-GenGroup/PaCo-RL/raw/main/assets/dataset_pipeline.png" alt="PaCo-RL Overview" width="800"/>
 </div>
+## Example Usage
 For detailed installation, training of the reward model (PaCo-Reward), and running the full RL training (PaCo-GRPO), please refer to the [official GitHub repository](https://github.com/X-GenGroup/PaCo-RL). The repository provides comprehensive documentation for each component.
+```python
+import torch
+from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
+from peft import PeftModel
+from qwen_vl_utils import process_vision_info
+# Load base model
+base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
+    "Qwen/Qwen2.5-VL-7B-Instruct",
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
+# Load LoRA adapter
+model = PeftModel.from_pretrained(
+    base_model,
+    "X-GenGroup/PaCo-Reward-7B-Lora"
+)
+image1 = 'https://huggingface.co/X-GenGroup/PaCo-Reward-7B/resolve/main/images/image_1.jpg'
+image2 = 'https://huggingface.co/X-GenGroup/PaCo-Reward-7B/resolve/main/images/image_2.jpg'
+main_prompt = 'Generate multiple images portraying a medical scene of a dentist in scrubs. The images should include activities such as explaining oral hygiene to a patient, taking X-rays of teeth, cleaning teeth in a dental office, and filling a cavity during an appointment. The setting should depict a realistic dental clinic.'
+text_prompt = (
+    f"Given two subfigures generated based on the theme: \"{main_prompt}\", "
+    f"do the two images maintain consistency in terms of style, logic and identity? "
+    f"Answer \"Yes\" and \"No\" first, and then provide detailed reasons."
+)
+# Example: Compare whether two images are visually consistent
+messages_1 = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image1},
+            {"type": "image", "image": image2},
+            {"type": "text", "text": text_prompt},
+        ],
+    }
+]
+# Preparation for inference
+text = processor.apply_chat_template(
+    messages_1, tokenize=False, add_generation_prompt=True
+)
+image_inputs, video_inputs = process_vision_info(messages_1)
+inputs = processor(
+    text=[text],
+    images=image_inputs,
+    videos=video_inputs,
+    padding=True,
+    return_tensors="pt",
+)
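+# Move inputs to the model's device rather than hard-coding "cuda",
+# since device_map="auto" decides where the model is placed
+inputs = inputs.to(base_model.device)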
+# Inference: Calculate consistency score
+# Get logits for first token
+with torch.no_grad():
+    outputs = model(**inputs)
+    first_token_logits = outputs.logits[0, -1, :]  # Last position of prompt
+# Get token IDs for "Yes" and "No"
+yes_id = processor.tokenizer.encode("Yes", add_special_tokens=False)[0]
+no_id = processor.tokenizer.encode("No", add_special_tokens=False)[0]
+# Calculate probability
+yes_logit = first_token_logits[yes_id]
+no_logit = first_token_logits[no_id]
+# Two-way softmax over the "Yes"/"No" logits, computed in float32 for stability
+yes_prob = torch.softmax(torch.stack([yes_logit, no_logit]).float(), dim=0)[0]
+# PaCo-Reward-7B and this model may differ in scores due to numerical precision
+print(f"Consistency Score (Yes Conditional Probability): {yes_prob.item():.4f}")
+# Inference: Generate detailed reasons
+generated_ids = model.generate(**inputs, max_new_tokens=512)
+generated_ids_trimmed = [
+    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+]
+output_text = processor.batch_decode(
+    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
+)
+print(output_text[0])
+```
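
The consistency score in the example is the probability of "Yes" renormalized over only the two tokens "Yes" and "No". That quantity equals a sigmoid of the logit difference, which avoids exponentiating large logits directly. The following is a minimal standalone sketch of that arithmetic using plain Python floats instead of tensors; the helper name `yes_probability` is hypothetical and not part of the PaCo-RL code, and the logit values are illustrative only, not model outputs.

```python
import math


def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Probability of "Yes" under a two-way softmax over the Yes/No logits.

    exp(y) / (exp(y) + exp(n)) simplifies to sigmoid(y - n), so only the
    difference between the two logits matters.
    """
    diff = yes_logit - no_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    # For negative differences, exp(diff) is at most 1, so it cannot overflow.
    e = math.exp(diff)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Illustrative logits only (assumed values, not produced by the model).
    print(f"{yes_probability(2.0, 0.0):.4f}")
    # Shifting both logits by the same amount leaves the score unchanged.
    print(f"{yes_probability(1000.0, 998.0):.4f}")
```

Because only the difference enters the formula, adding a constant to both logits does not change the score, which is why the second call prints the same value as the first even though a direct `math.exp(1000.0)` would overflow.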
 ## Model Zoo
 The PaCo-RL framework includes several models available on Hugging Face: