MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper: arXiv 2409.05840

The pretrained projector weights and the instruction-tuned weights are listed below:

| Model | Pretrained Projector | Base LLM | PT Data | IT Data | Download |
|---|---|---|---|---|---|
| MMEvol-Qwen2-7B | mm_projector | Qwen2-7B | LLaVA-Pretrain | MMEvol | ckpt |

Evaluation results on multimodal benchmarks:

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
|---|---|---|---|---|---|---|---|---|---|
| MMEvol-Qwen2-7B | 55.8 | 51.6 | 64.1 | 52.4 | 45.1 | 74.7 | 87.8 | 47.7 | 63.9 |

Results on additional benchmarks:

| Model | VQA_v2 | GQA | MIA | MMSInst |
|---|---|---|---|---|
| MMEvol-Qwen2-7B | 83.1 | 65.5 | 77.6 | 41.8 |

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.