---
license: apache-2.0
base_model:
- duongve/NetaYume-Lumina-Image-2.0
pipeline_tag: text-to-image
---

# Neta Cat Tower

## Introduction

**Neta Cat Tower** is a text-to-image model fine-tuned from [NetaYume Lumina](https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0). The model was trained to enhance its anime style; no training was performed to add new character knowledge.

### Model Description

- **Developed by:** [nuko masshigura](https://huggingface.co/nukomasshigura)
- **Model type:** Text-to-image generative model based on [Neta Lumina](https://huggingface.co/neta-art/Neta-Lumina)
- **License:** [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
- **Finetuned from model:** [NetaYume Lumina](https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0)

### Model Components

- Diffusion Transformer: this model
- Text Encoder: [pre-trained Gemma-2-2b](https://huggingface.co/neta-art/Neta-Lumina/blob/main/Text%20Encoder/gemma_2_2b_fp16.safetensors)
- AutoEncoder: [pre-trained Flux.1 dev AE](https://huggingface.co/neta-art/Neta-Lumina/blob/main/VAE/ae.safetensors)

The `all_in_one` file is a single checkpoint that bundles the DiT, text encoder, and autoencoder.
## How to Get Started with the Model

Please refer to the [Neta Lumina model card](https://huggingface.co/neta-art/Neta-Lumina).

### Recommended settings

- Sampler: res_multistep / euler_ancestral
- Scheduler: linear_quadratic
- Steps: >= 30
- CFG (guidance): 4 – 5.5
- Resolution: 1024 × 1024, 768 × 1532, 968 × 1322, or >= 1024

### Prompt

Please refer to the [Neta Lumina Prompt Book](https://www.neta.art/blog/neta_lumina_prompt_book/).

For character knowledge, please refer to the [NetaYume Lumina Civitai page](https://civitai.com/models/1790792).

## Training Information

### v1

- Base model: [NetaYume Lumina v3.5 (pre-trained)](https://civitai.com/models/1790792?modelVersionId=2298660)
- Dataset: 2.1k anime-style images with Danbooru tags and English captions
- Hardware: GeForce RTX 5090 × 1
- Training tool: sd-scripts
- mixed_precision: bf16
- save_precision: fp16
- resolution: '1280,1280'
- optimizer_type: AdamW8bit
- learning_rate: 3e-5
- lr_scheduler: warmup_stable_decay
- train_epochs: 20
- train_batch_size: 1
- gradient_accumulation_steps: 4
- min_snr_gamma: 5
- ip_noise_gamma: 0.1
- timestep_sampling: nextdit_shift

## Acknowledgments

- Thanks to [duongve](https://huggingface.co/duongve) for sharing an awesome model.
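The recommended settings above can be sketched as a minimal generation script. This is an assumption-laden sketch, not a confirmed workflow: it presumes the checkpoint can be loaded through diffusers' `Lumina2Pipeline` (the recommended `res_multistep` / `linear_quadratic` samplers are ComfyUI names with no exact diffusers equivalent), and the repo id passed to `from_pretrained` is hypothetical.

```python
# Generation settings taken from the "Recommended settings" section above.
settings = {
    "num_inference_steps": 30,  # Steps: >= 30
    "guidance_scale": 4.5,      # CFG (guidance): 4 - 5.5
    "width": 1024,              # e.g. 1024 x 1024
    "height": 1024,
}

def generate(prompt: str):
    """Hypothetical sketch: loading this checkpoint via diffusers is an
    assumption; the repo id below is a placeholder, not a confirmed path."""
    # Imports are local so the settings above can be inspected without
    # torch/diffusers installed.
    import torch
    from diffusers import Lumina2Pipeline  # assumed pipeline class

    pipe = Lumina2Pipeline.from_pretrained(
        "nukomasshigura/Neta-Cat-Tower",  # hypothetical repo id
        torch_dtype=torch.bfloat16,
    )
    pipe.to("cuda")
    return pipe(prompt=prompt, **settings).images[0]
```

In ComfyUI, prefer the workflow described in the Neta Lumina model card instead; this sketch only mirrors the numeric settings (steps, CFG, resolution), not the sampler/scheduler pair.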