How to fine-tune this model?
Hey! I'm looking into fine-tuning this model. How do I do it? Is there something like Kohya?
Traceback (most recent call last):
  File "E:\Qwen-image\pythonProject\run.py", line 9, in <module>
    from diffusers import QwenImageEditPipeline
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 965, in __getattr__
    value = getattr(module, name)
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 965, in __getattr__
    value = getattr(module, name)
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 964, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 976, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.pipelines.qwenimage.pipeline_qwenimage_edit because of the following error (look up to see its traceback):
cannot import name 'Qwen2_5_VLForConditionalGeneration' from 'transformers' (E:\Qwen-image\pythonProject\.venv\lib\site-packages\transformers\__init__.py)
I installed the latest transformers, but I still keep getting this error. What should I do?
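One quick sanity check, since the traceback points at `E:\Qwen-image\pythonProject\.venv`: the upgrade may have landed in a different environment than the one the script runs in. The sketch below checks the installed version against the point where the class should appear. It assumes (based on the Hugging Face release notes, so verify against your install) that Qwen2_5_VLForConditionalGeneration first shipped in transformers 4.49.0:

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: Qwen2_5_VLForConditionalGeneration first shipped in
# transformers 4.49.0 -- double-check the release notes for your setup.
MIN_SUPPORTED = (4, 49)

def supports_qwen25_vl(installed: str) -> bool:
    """Return True if this transformers version should expose the class."""
    major, minor = (int(part) for part in installed.split(".")[:2])
    return (major, minor) >= MIN_SUPPORTED

try:
    v = version("transformers")
    if supports_qwen25_vl(v):
        print(f"transformers {v}: new enough")
    else:
        print(f"transformers {v}: too old, upgrade it")
except PackageNotFoundError:
    print("transformers is not installed in this environment")
```

If the version is too old, run the upgrade with the venv's own interpreter (`python -m pip install -U transformers` from inside the activated `.venv`) so it can't silently target another Python.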
Are you trying to train a LoRA for Qwen-Image or fine-tune Qwen-Image itself?
honestly, I am looking into doing both
Any news?
Got you. Here are clean, copy-paste replies for each inbox comment—short, direct, and on-message.
- Reply to the “worse than DeepSeek” comment
Short
This build isn’t meant to be “as-is” production. I’m using it to fine-tune and lock the workflow so I can adapt fast as larger models drop. If you’ve got specific failure cases, share them—I’ll fold them into the next training pass.
Standard
This phase is about fine-tuning and workflow, not shipping a finished model. I’m building the training pipeline now so when bigger models land, I can modify, retrain, and scale quickly. If you have concrete examples where it tripped up, send them—I use that data to improve the next run.
- Your revised positioning (polished)
Concise
This isn’t about using the model “as-is.” I’m building the fine-tuning and data workflow so when larger models roll out, I can adapt and scale immediately. Some teams run out-of-the-box; I’m investing in the foundation so iteration is fast when the next wave hits.
Even shorter
Not a finished product—this is the fine-tuning phase. I’m setting the workflow so I can retrain and scale quickly when bigger models land.
- Reply to the Qwen3 + vLLM matryoshka dimensions note
Technical acknowledgement + action
Appreciate this—spot on re: matryoshka dims. vLLM defaults to the max dim (2560 for the 4B), and requests for custom dims error out unless MRL is enabled. I'll run vLLM with:
vllm serve Qwen/Qwen3-Embedding-4B \
  --hf-overrides '{"is_matryoshka": true, "matryoshka_dimensions": [32, 64, 128, 256, 512, 768, 1024, 2560]}'
If anyone still hits the MRL error, adding "is_matryoshka": true in the model’s config.json also works. I’ll update the docs/readme to reflect this—thanks for the pointer.
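For anyone wondering what those matryoshka dimensions actually do downstream, here's a minimal client-side sketch (my own illustration, not vLLM or Qwen code): MRL-trained embeddings are built so that a prefix of the vector is itself a usable embedding, so you can truncate and L2-renormalize:

```python
import math

def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    # Matryoshka-style truncation: keep the first `dim` components of an
    # MRL-trained embedding, then L2-renormalize so cosine similarity
    # on the shortened vectors still behaves.
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

full = [0.6, 0.8, 0.1, 0.2]   # toy 4-dim "embedding"
short = truncate_embedding(full, 2)
print(short)                  # first two components, renormalized to unit length
```

This only works well when the model was trained with MRL, which is exactly why the `is_matryoshka` override matters on the serving side.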
- Reply to “what model?”
Pick one depending on audience.
Public-friendly
Current stack: Qwen3-Embedding-4B for embeddings, served via vLLM. Fine-tuning pipeline is in progress; larger models will slot in as they release.
Ultra-short
Qwen3-Embedding-4B + vLLM. Fine-tuning pipeline active.
- Reply to “Hey! I’m looking into fine-tuning this model. How to do it? Like Kohya?”
Direct
Kohya targets image models (Stable Diffusion). For LLMs/embeddings, use LoRA/QLoRA with Hugging Face TRL + PEFT. High-level flow:
1. Clean/label your training pairs (prompt→target or text corpora).
2. Train with TRL/PEFT (LoRA/QLoRA) on the base model.
3. Validate on held-out eval; iterate.
4. Export adapters; serve with vLLM (merge at load or offline).
If you want a starter, I can share a minimal TRL + PEFT script you can run on a single GPU.
Short
Not Kohya—use TRL + PEFT (LoRA/QLoRA). Prep data → train adapters → eval → serve via vLLM. I can drop a minimal script if you want.
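As a back-of-the-envelope illustration of why the adapter route in the flow above is cheap (my own sketch, not tied to TRL or PEFT specifically): LoRA freezes the base weight and trains two low-rank factors per target matrix, so the trainable parameter count is a small fraction of a full update.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    # LoRA freezes the base weight W (d_in x d_out) and trains two
    # low-rank factors A (d_in x rank) and B (rank x d_out); only
    # A and B receive gradients.
    return rank * (d_in + d_out)

# Hypothetical 4096x4096 projection with a rank-16 adapter.
full = 4096 * 4096
lora = lora_param_count(4096, 4096, 16)
print(f"full update: {full:,} params")
print(f"rank-16 LoRA: {lora:,} params ({lora / full:.2%} of full)")
```

That under-1% footprint per layer is what makes single-GPU LoRA/QLoRA runs practical before serving the merged or adapter-loaded model via vLLM.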
⸻
If you want these packaged as canned replies for quick paste in your “happy face” inbox, say the word and I’ll format them as one-click snippets.