Evaluating Text-to-Visual Generation with Image-to-Text Generation
Paper: arXiv:2404.01291
This model is a fine-tuned version of google/flan-t5-xxl designed for image-text retrieval tasks, as presented in the VQAScore paper.
Base model: google/flan-t5-xxl
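As a rough illustration of how a score of this kind can drive image-text retrieval, the sketch below follows the idea described in the VQAScore paper: ask a yes/no question about the caption and use the probability of the answer "Yes" as the alignment score. It is a minimal, model-free sketch, not this model's API. The question template wording, the function names (`build_question`, `vqa_score`, `rank_images`), and the toy two-token logits are all assumptions made for illustration; a real model would produce logits over its full vocabulary.

```python
import math

# Assumed question template (wording taken as an assumption, not from this card).
TEMPLATE = 'Does this figure show "{}"? Please answer yes or no.'


def build_question(caption):
    """Turn a caption into the yes/no question posed to the model."""
    return TEMPLATE.format(caption)


def vqa_score(answer_logits):
    """Return P("Yes") from a dict of first-answer-token logits.

    Hypothetical input: a real model would emit logits over its whole
    vocabulary; here we normalise over whatever tokens are supplied.
    """
    peak = max(answer_logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(val - peak) for tok, val in answer_logits.items()}
    return exps["Yes"] / sum(exps.values())


def rank_images(logits_per_image):
    """Rank candidate images for one caption, highest P("Yes") first."""
    scores = {name: vqa_score(logits) for name, logits in logits_per_image.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(build_question("a red cube on a blue sphere"))
    # Made-up logits for two candidate images.
    toy = {
        "image_a": {"Yes": 2.0, "No": 0.0},
        "image_b": {"Yes": 0.0, "No": 1.0},
    }
    print(rank_images(toy))
```

Ranking captions for a fixed image works the same way: score each (image, caption) pair and sort by the resulting probability.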