An Exploration of Default Images in Text-to-Image Generation
Abstract
This research investigates default image generation in text-to-image models, analyzing patterns across over 750,000 images and their impact on user satisfaction.
In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown terms. In such cases, the model may generate default images: images that closely resemble each other across many unrelated prompts. Studying default images is valuable for designing better solutions for prompt engineering and TTI generation. We present the first investigation into default images on Midjourney. We describe an initial study in which we manually crafted input prompts that trigger default images, followed by several ablation studies. Building on these, we conduct a computational analysis of over 750,000 images, revealing consistent default images across unrelated prompts. We also conduct an online user study investigating how default images may affect user satisfaction. Our work lays the foundation for understanding default images in TTI generation, highlighting their practical relevance as well as challenges and future research directions.
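The abstract does not specify how default images were detected; as a purely illustrative sketch (not the authors' pipeline), one way to surface candidates is to flag near-identical outputs generated from unrelated prompts, for example with perceptual hashing. The directory layout, filename convention, and distance threshold below are assumptions for the example only.

```python
# Hypothetical sketch: flag candidate "default images" by finding near-identical
# outputs that were generated from *different* prompts.
# Assumes images are stored as "<prompt_id>_<image_id>.png" and uses perceptual
# hashing (Pillow + imagehash) as a simple stand-in similarity measure.

from pathlib import Path
from itertools import combinations

from PIL import Image
import imagehash

MAX_HAMMING = 6  # assumed threshold: hashes this close count as near-duplicates


def load_hashes(image_dir: str) -> dict[str, imagehash.ImageHash]:
    """Compute a perceptual hash for every PNG in the directory."""
    hashes = {}
    for path in Path(image_dir).glob("*.png"):
        with Image.open(path) as img:
            hashes[path.stem] = imagehash.phash(img)
    return hashes


def candidate_default_images(hashes: dict[str, imagehash.ImageHash]) -> list[tuple[str, str, int]]:
    """Return pairs of images from different prompts that look nearly identical."""
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2):
        prompt_a, prompt_b = name_a.split("_")[0], name_b.split("_")[0]
        if prompt_a == prompt_b:
            continue  # same prompt: similarity here is expected, not a "default image"
        distance = hash_a - hash_b  # Hamming distance between perceptual hashes
        if distance <= MAX_HAMMING:
            pairs.append((name_a, name_b, distance))
    return pairs


if __name__ == "__main__":
    hashes = load_hashes("generated_images")
    for name_a, name_b, dist in candidate_default_images(hashes):
        print(f"{name_a} ~ {name_b} (Hamming distance {dist})")
```

Note that the exhaustive pairwise comparison above is only practical for small samples; at the scale reported in the paper (over 750,000 images), one would need an approximate nearest-neighbor index or hash bucketing instead.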