MAGIC: A Co-Evolving Attacker–Defender Adversarial Game for Robust LLM Safety

This repository provides the paper and official model links for MAGIC, a multi-backbone instruction-tuned model family designed to improve the robustness and safety of large language models.

🧠 Overview

MAGIC formulates LLM safety alignment as a co-evolving adversarial game between an attacker and a defender.

Instead of relying on static red-teaming datasets, the attacker continuously generates increasingly challenging harmful or policy-violating prompts, while the defender model is iteratively trained to resist these attacks without sacrificing helpfulness. Through this dynamic co-evolution process, the defender generalizes better to unseen and adaptive jailbreak attacks, leading to improved robustness in real-world deployment scenarios.

📄 Paper

Title: MAGIC: A Co-Evolving Attacker–Defender Adversarial Game for Robust LLM Safety
Authors: Xiaoyu Wen, Zhida He, Han Qi, Ziyu Wan, Ying Wen, Tianhang Zheng, Xingcheng Xu, Chaochao Lu, Qiaosheng Zhang
arXiv: https://arxiv.org/abs/2602.01539
PDF: https://arxiv.org/pdf/2602.01539

🔗 Code & Repository

Official GitHub Repository:
https://github.com/BattleWen/MAGIC

📊 Datasets

The MAGIC Attack Pool Benchmark used in this work is publicly available on Hugging Face:

from datasets import load_dataset

dataset = load_dataset("XiaoyuWen/MAGIC-Attack-Pool-Benchmark")

🤖 Models

Official model checkpoints:

Qwen2.5-7B-Instruct
https://huggingface.co/XiaoyuWen/MAGIC-Qwen2.5-7B-Instruct
Qwen2.5-14B-Instruct
https://huggingface.co/XiaoyuWen/MAGIC-Qwen2.5-14B-Instruct
LLaMA3.1-8B-Instruct
https://huggingface.co/XiaoyuWen/MAGIC-Llama3.1-8B-Instruct

📚 Citation

@article{wen2026magic,
    title={MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM Safety},
    author={Wen, Xiaoyu and He, Zhida and Qi, Han and Wan, Ziyu and Wen, Ying and Zheng, Tianhang and Xu, Xingcheng and Lu, Chaochao and Zhang, Qiaosheng},
    journal={arXiv preprint arxiv:2602.01539},
    year={2026}
}

Downloads last month: -; Downloads are not tracked for this model. How to track

Paper for XiaoyuWen/MAGIC

MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM Safety

Paper • 2602.01539 • Published 4 days ago • 1