---
pipeline_tag: text-to-image
license: apache-2.0
base_model:
- neta-art/Neta-Lumina
- Alpha-VLLM/Lumina-Image-2.0
tags:
  - stable-diffusion
  - text-to-image
  - comfyui
  - diffusion-single-file
---

# NetaYume Lumina Image v2.0
![NetaYume Lumina Image v2.0](./Example/Demo_v2.png)

---
**I. Introduction**

NetaYume Lumina is a text-to-image model fine-tuned from [Neta Lumina](https://huggingface.co/neta-art/Neta-Lumina), a high-quality anime-style image generation model developed by [Neta.art Lab](https://huggingface.co/neta-art). Neta Lumina in turn builds upon [Lumina-Image-2.0](https://huggingface.co/Alpha-VLLM/Lumina-Image-2.0), an open-source base model released by the [Alpha-VLLM](https://huggingface.co/Alpha-VLLM) team at Shanghai AI Laboratory.

This model was trained to generate both realistic human images and high-quality anime-style images. Although it was fine-tuned on a specific dataset, it retains a significant amount of knowledge from the base model.

**Key Features:**
- **High-Quality Anime Generation**: Generates detailed anime-style images with sharp outlines, vibrant colors, and smooth shading.
- **Improved Character Understanding**: Better captures characters, especially those from the Danbooru dataset, resulting in more coherent and accurate character representations.
- **Enhanced Fine Details**: Accurately generates accessories, clothing textures, hairstyles, and background elements with greater clarity.


The file `NetaYume_Lumina_v2_all_in_one.safetensors` is an all-in-one checkpoint for ComfyUI that contains the weights for the VAE, the text encoder, and the image backbone.

---

**II. Model Components & Training Details**
- **Text Encoder**: Pre-trained **Gemma-2-2b**
- **Variational Autoencoder**: Pre-trained **Flux.1 dev's VAE**
- **Image Backbone**: Fine-tuned from **NetaLumina's Image Backbone**

---

**III. Suggestions**

**System Prompt:** A system prompt helps the model understand and follow your prompt, making it easier to generate the images you want.

For anime-style images using Danbooru tags:
    
     You are an assistant designed to generate anime images based on textual prompts. 
    
     You are an assistant designed to generate high-quality images based on user prompts and  danbooru tags.
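
As a rough illustration of how a system prompt can be combined with a user prompt, here is a minimal Python sketch. It assumes the Lumina-Image-2.0 convention of joining the two with the literal marker `<Prompt Start>`; that marker, the dictionary keys `anime` and `danbooru`, and the helper name `build_prompt` are assumptions made for this example and are not defined by this model card. The two system prompt strings are copied verbatim from above.

```python
# Hypothetical helper: joins a system prompt and a user prompt.
# Assumption: the text encoder expects "<system prompt> <Prompt Start> <user prompt>",
# as in Lumina-Image-2.0. Check your own workflow before relying on this.
SYSTEM_PROMPTS = {
    # Keys are illustrative labels only.
    "anime": (
        "You are an assistant designed to generate anime images "
        "based on textual prompts."
    ),
    "danbooru": (
        "You are an assistant designed to generate high-quality images "
        "based on user prompts and  danbooru tags."
    ),
}

PROMPT_MARKER = "<Prompt Start>"  # assumed separator


def build_prompt(user_prompt: str, style: str = "anime") -> str:
    """Return the full text passed to the text encoder."""
    if style not in SYSTEM_PROMPTS:
        raise KeyError(f"unknown style {style!r}; expected one of {sorted(SYSTEM_PROMPTS)}")
    return f"{SYSTEM_PROMPTS[style]} {PROMPT_MARKER} {user_prompt.strip()}"


if __name__ == "__main__":
    print(build_prompt("1girl, solo, looking at viewer", style="danbooru"))
```

In ComfyUI the same effect is usually achieved by typing the system prompt and the user prompt into the text-encoding node, so a helper like this is mainly useful for scripted or batch workflows.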

**Recommended Settings**
- CFG: 4–7
- Sampling Steps: 40–50
- Sampler:
	- Euler a (with scheduler: normal)
	- res_multistep (with scheduler: linear_quadratic)
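
For scripted workflows, the recommended ranges above can be captured in a small settings object that flags out-of-range values. This is only a sketch: the class `SamplerSettings`, its default values, and the identifier `euler_ancestral` (assumed here to be the ComfyUI name for "Euler a") are illustrative choices, not part of the model or of any library API.

```python
from dataclasses import dataclass

# Sampler -> scheduler pairs taken from the recommended settings.
# Assumption: "Euler a" corresponds to the identifier "euler_ancestral".
RECOMMENDED_PAIRS = {
    "euler_ancestral": "normal",
    "res_multistep": "linear_quadratic",
}


@dataclass(frozen=True)
class SamplerSettings:
    """Hypothetical container for generation settings."""

    cfg: float = 5.5          # illustrative default inside the 4-7 range
    steps: int = 40           # illustrative default inside the 40-50 range
    sampler: str = "res_multistep"
    scheduler: str = "linear_quadratic"

    def problems(self) -> list[str]:
        """Return a list of deviations from the recommended settings."""
        issues = []
        if not 4 <= self.cfg <= 7:
            issues.append(f"cfg {self.cfg} is outside the recommended 4-7 range")
        if not 40 <= self.steps <= 50:
            issues.append(f"steps {self.steps} is outside the recommended 40-50 range")
        expected = RECOMMENDED_PAIRS.get(self.sampler)
        if expected is None:
            issues.append(f"sampler {self.sampler!r} is not one of the recommended samplers")
        elif expected != self.scheduler:
            issues.append(
                f"sampler {self.sampler!r} is recommended with scheduler {expected!r}, "
                f"not {self.scheduler!r}"
            )
        return issues


if __name__ == "__main__":
    for issue in SamplerSettings(cfg=9.0, steps=20).problems():
        print("warning:", issue)
```

Values outside these ranges are not necessarily wrong; the check only reports where a configuration departs from the recommendations.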

---
**IV. Acknowledgments**
- [narugo1992](https://huggingface.co/narugo) – for the invaluable Danbooru dataset
- [Alpha-VLLM](https://huggingface.co/Alpha-VLLM) – for creating a wonderful model!
- [Neta.art](https://huggingface.co/neta-art/Neta-Lumina) and their team – for openly sharing an awesome model.