---
license: other
license_name: fair-ai-public-license-1.0-sd
license_link: https://freedevproject.org/faipl-1.0-sd/
base_model:
- Bluvoll/Experimental_EQ-VAE_NoobAI_tests
- Laxhar/noobai-XL-Vpred-1.0
library_name: diffusers
---

![Untitled-1 copy](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/6Ac5cvhD-KBTe9zyktmh8.png)

## Model Details

This is an experimental conversion of Noobai v-pred to a rectified flow target, using EQ-VAE.

![rf vs vpred](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/e8Qi7d5DWtQQvOg3vvb2n.png)

### Model Description

The model is a continuation of Noobai training on the same dataset, with a new diffusion target and a few improvements to the existing tag approach.* Given the scope of this undertaking, this is only an experimental version, using only a subset of the full original data.

The current state of the model is acceptable for general and research purposes, such as image generation, finetuning, LoRA training, and others. We provide example settings for a common style-training approach below.

Generally, the model is fairly stable, but it can suffer certain drawbacks stemming from limited training, such as a malformed understanding of certain tags and colors. These appeared in our tests, but are rarely, if ever, observed with normal prompts in practice.

- **Developed by:** Cabal Research (Bluvoll, Anzhc)
- **Funded by:** Community, Bluvoll
- **License:** [fair-ai-public-license-1.0-sd](https://freedevproject.org/faipl-1.0-sd/)
- **Finetuned from model:** [Noobai V-pred 1.0](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0)

\*Removed the massive keep-token prefix (in some cases over 6 tags) and introduced "protected tags", which allow indiscriminate shuffling while keeping those tokens undroppable.

## Bias and Limitations

Due to a low budget (~$150 total), we have not been able to fully stabilize the model, so you can and will encounter some issues that we either did not find in our tests or were unable to address.
That wouldn't be too different from other base models, but your mileage will vary.

Most biases of the official dataset will apply (Blue Archive, etc.).

Some color biases were not reduced, or became more apparent, due to quirks in the convergence from Noobai v-pred to rectified flow. We did our best to mitigate this by training a bit further, but you will still encounter it with certain strong color prompts. Some colors are in an unstable state and hard to achieve due to the unfortunate state of their convergence at the current step (black and dark tones in particular; for example, `dark` will not generate a dark image, you need to prompt `dark theme` for that).

## Model Output Examples

![01290-2943123450](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/D9PLWrawp_QEMZyk5dpbl.png)

![01292-1874776530](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/cN1UUxoywBcA7L3m22bQb.png)

![01291-2943123455](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/wXp2AtQA6ifv_50_2o7WE.png)

![01294-1874776532](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/ZVRCxcN-vjUUa6EPDkRsO.png)

![01296-1021002911](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/y8G-lW4AE7JcRpKu5G3Os.png)

![01298-2775208673](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/QUNU14SHnw0ILS1rCVHm1.png)

![01293-1874776531](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/NlrQ1ce66fGqet1GCQTTY.png)

![01295-1874776534](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/LwEQ3DvhL9GFDkK2zL12V.png)

![01297-2775208672](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/_jSMO7b7niZs_1stOOo-u.png)

![01287-37110566](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/pSgA6H158rxIQoe7DYsWQ.png)
![01274-3354982185](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/sR-JTq5GB573kmAA8Nisa.png)

## Recommendations

### Inference

#### Comfy

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/IQFZizmP_NSbEMYE5LC7T.png)

(The workflow is available alongside the model in the repo.)

Same as your normal inference, but with the addition of an SD3 sampling node and an optional conv padding node, which is required for correct edges (the VAE and model have been trained with padded convolutions in the VAE, to allow easier learning of edge content).

Recommended parameters:

- **Sampler**: Euler, Euler A, DPM++ SDE, etc.
- **Steps**: 20-28
- **CFG**: 5-7
- **Schedule**: Normal/Simple
- **Positive Quality Tags**: `masterpiece, best quality`
- **Negative Tags**: `worst quality, normal quality, bad anatomy`

#### A1111 WebUI

Recommended WebUI: [ReForge](https://github.com/Panchovix/stable-diffusion-webui-reForge) - has native support for both RF and conv padding.

Possible WebUIs: [ErsatzForge](https://github.com/DenOfEquity/ersatzForge) - has native support for RF, but it is written with hardcoded name checking, so it will not work out of the box. We are also unable to verify whether the approach is correct, but it worked after adding the model name to the checked list.

**How to use in ReForge**:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/UV5Yp66H7YlccdQqborPf.png)

(Ignore the Sigma max field at the top; it is not used in RF.)

Support for RF in ReForge is implemented through a built-in extension:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/LpMF0lmC96X001Au9fFU_.png)

Set the parameters as shown, and you're good to go.

**How to turn on padding**:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/GmieYDa5l1C9sUiN363xt.png)

Turn this on, save, and FULLY RELOAD the UI by closing the console and launching it again. This is required.
The setting does not change until the UI is fully reloaded.

Recommended parameters:

- **Sampler**: Euler A Comfy RF, Euler, DPM++ SDE Comfy, etc. **ALL VARIANTS MUST BE RF OR COMFY, IF AVAILABLE.** (In ComfyUI the routing is automatic, but not in the WebUI.)
- **Steps**: 20-28
- **CFG**: 5-7
- **Schedule**: Normal/Simple
- **Positive Quality Tags**: `masterpiece, best quality`
- **Negative Tags**: `worst quality, normal quality, bad anatomy`

**ADetailer fix for RF**:

By default, ADetailer discards the Advanced Model Sampling extension, which breaks RF. You need to add AMS to this part of the settings:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/RQMtfm5Xi3V7oNsqXoZJN.png)

Add `advanced_model_sampling_script,advanced_model_sampling_script_backported` there.

If that does not work, go into the ADetailer extension, find `args.py`, open it, and replace `_builtin_script` like this:

![image](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/rmnS-i_kciJzTZmeR-mGP.png)

Here it is as text, for easy copying:

```python
_builtin_script = (
    "advanced_model_sampling_script",
    "advanced_model_sampling_script_backported",
    "hypertile_script",
    "soft_inpainting",
)
```

## Training

### Model Composition

(Relative to the base it was trained from.)

- **Unet**: Same
- **CLIP L**: Same, frozen
- **CLIP G**: Same, frozen
- **VAE**: Changed; new VAE - [EQB7](https://huggingface.co/Anzhc/MS-LC-EQ-D-VR_VAE) with conv padding

### Training Details

(Base / quality-tuned.)

- **Samples Seen** (unbatched steps): ~2M / ~400k
- **Learning Rate**: 2e-5 / 2e-5
- **Effective Batch Size**: 1280 (40 real * 4 accum * 8 devices) / 1280 (40 * 4 * 8)
- **Precision**: Full BF16
- **Optimizer**: AdamW8bit with Kahan summation
- **Weight Decay**: 0.01
- **Schedule**: Constant with warmup
- **Timestep Sampling Strategy**: Logit-Normal (sometimes referred to as Lognorm), Shift 2.5
- **Text Encoders**: Frozen
- **Keep Token**: False (used "protected tags" instead); all tags are shuffled
- **Tag Dropout**: 10%
- **Uncond Dropout**: 10%
- **Optimal Transport**: True
- **VAE Conv Padding**: True
- **VAE Shift**: 0.1726
- **VAE Scale**: 0.1280

(Computed against ~80k anime images prior to training. The Scale is about the same as in the base SDXL VAE (negligible difference), but the Shift is drastically different: 0.1726 vs ~1.60.)

#### Training Data

The "original" Noobai data subset of ~2 million samples, followed by a WAF* subset of ~20 thousand for quality tuning of this intermediate checkpoint. Tags were not changed; the data was taken as-is, per the wishes of the community.

\*WAF: Weighted Aesthetic Filter, our recent solution for filtering data based on the input of multiple scoring models at the same time (at varied weights, adapted to their specific prediction classes/ranges), including specialized models for specific content. A high general threshold was used, resulting in the top ~5% of the data being selected for quality tuning.

### LoRA Training

The current base is highly trainable. We are mostly style trainers and finetuners, so we give our current recommendations for that; from them you can derive settings you find reasonable, based on your experience with other model types.

My current style training settings (Anzhc):

- **Learning Rate**: Tested up to **7.5e-4**; LoRA is still stable at that, somehow. Prolonged training (300+ images for 50 epochs) at that LR did not result in degradation; it can likely be pushed even further, up to 1e-3, at least at the batch size I'm using.
- **Batch Size**: 144 (6 real * 24 accum), using SGA (Stochastic Gradient Accumulation); without SGA I would probably lower accum to 4-8.
- **Optimizer**: AdamW8bit with Kahan summation
- **Schedule**: ReREX (use REX for simplicity)
- **Precision**: Full BF16
- **Weight Decay**: 0.02
- **Timestep Sampling Strategy**: Logit-Normal, Shift 2.5 (closest to what I use, result-wise)
- **Dim/Alpha/Conv/Alpha**: 24/24/24/24 (LyCORIS/LoCon)
- **Text Encoders**: Frozen
- **Optimal Transport**: True
- **Expected Dataset Size**: 100 images (can be as few as 10, but balance with repeats to roughly this target)
- **Epochs**: 50 (yes, even with 10 repeats; 500 effective epochs works just fine and doesn't break, per my tests)

### Hardware

The model was trained on a cloud 8xA100 node.

### Software

A custom fork of [SD-Scripts](https://github.com/bluvoll/sd-scripts), maintained by Bluvoll.

## Acknowledgements

### Special Thanks

**To the supporting individuals of the community who donated funds to kickstart this training:**

- Itterative
- Sab
- Puzll
- Kyonisus

It wouldn't have happened at this scale without you.

---

# Support

If you wish to support our continued effort of making waifus 0.2% better, you can do so here:

**https://ko-fi.com/bluvoll**
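
---

# Appendix: Timestep Sampling Sketch

Both the base training and the LoRA recipe above use logit-normal timestep sampling with Shift 2.5. As a rough illustration only (this is not the training code; the function name is ours, and the shift transform shown is the SD3-style one), the sampler can be sketched as:

```python
import numpy as np

def sample_timesteps(n, shift=2.5, rng=None):
    """Draw n training timesteps in (0, 1) via logit-normal sampling,
    with an SD3-style shift toward noisier timesteps for shift > 1.

    Hypothetical helper for illustration; the actual trainer is a
    custom sd-scripts fork.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(n)            # u ~ N(0, 1)
    t = 1.0 / (1.0 + np.exp(-u))          # sigmoid(u): logit-normal in (0, 1)
    # Shift transform: identity at shift=1; pushes mass toward t=1 for shift>1.
    return shift * t / (1.0 + (shift - 1.0) * t)

# With shift=2.5, the average sampled timestep moves from ~0.5 toward ~0.7,
# i.e. training spends more steps on high-noise regions.
timesteps = sample_timesteps(8, shift=2.5)
```

With shift=1 the transform is the identity, so the distribution stays symmetric around 0.5; shift=2.5 biases sampling toward the high-noise end, matching the Shift 2.5 setting listed in the training details.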