Model Card for LLaVa-Phi-2-3B
Model Details
Model Description
- Developed by: LAION, SkunkworksAI & Ontocord
- Model type: LLaVA is an open-source chatbot trained by fine-tuning Phi-2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
- Finetuned from model: Phi-2
- License: MIT
Model Sources
Evaluation
Benchmarks
| Model | Parameters | SQA | GQA | TextVQA | POPE |
| --- | --- | --- | --- | --- | --- |
| LLaVA-1.5 | 7.3B | 68.0 | 62.0 | 58.3 | 85.3 |
| MC-LLaVA-3B | 3B | - | 49.6 | 38.59 | - |
| LLaVA-Phi | 3B | 68.4 | - | 48.6 | 85.0 |
| moondream1 | 1.6B | - | 56.3 | 39.8 | - |
| llava-phi-2-3b | 2.7B | 69.0 | 51.2 | 47.0 | 86.0 |
| llava-phi-2-3b-siglip | 2.7B | 70.15 | 52.56 | 47.99 | 87.00 |