How to fine-tune this model?

#28
by limelight1231 - opened

Hey! I'm looking into fine-tuning this model. How do I do it? Is there something like Kohya?

Traceback (most recent call last):
  File "E:\Qwen-image\pythonProject\run.py", line 9, in <module>
    from diffusers import QwenImageEditPipeline
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 965, in __getattr__
    value = getattr(module, name)
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 965, in __getattr__
    value = getattr(module, name)
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 964, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "E:\Qwen-image\pythonProject\.venv\lib\site-packages\diffusers\utils\import_utils.py", line 976, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.pipelines.qwenimage.pipeline_qwenimage_edit because of the following error (look up to see its traceback):
cannot import name 'Qwen2_5_VLForConditionalGeneration' from 'transformers' (E:\Qwen-image\pythonProject\.venv\lib\site-packages\transformers\__init__.py)
I installed the latest transformers, but still keep getting this error. What should I do?

Are you trying to train a LoRA for Qwen-Image or fine-tune Qwen-Image itself?

honestly, I am looking into doing both

Any news?

Got you. Here are clean, copy-paste replies for each inbox comment—short, direct, and on-message.

  1. Reply to the “worse than DeepSeek” comment

Short

This build isn’t meant to be “as-is” production. I’m using it to fine-tune and lock the workflow so I can adapt fast as larger models drop. If you’ve got specific failure cases, share them—I’ll fold them into the next training pass.

Standard

This phase is about fine-tuning and workflow, not shipping a finished model. I’m building the training pipeline now so when bigger models land, I can modify, retrain, and scale quickly. If you have concrete examples where it tripped up, send them—I use that data to improve the next run.

  2. Your revised positioning (polished)

Concise

This isn’t about using the model “as-is.” I’m building the fine-tuning and data workflow so when larger models roll out, I can adapt and scale immediately. Some teams run out-of-the-box; I’m investing in the foundation so iteration is fast when the next wave hits.

Even shorter

Not a finished product—this is the fine-tuning phase. I’m setting the workflow so I can retrain and scale quickly when bigger models land.

  3. Reply to the Qwen3 + vLLM matryoshka dimensions note

Technical acknowledgement + action

Appreciate this—spot on re: matryoshka dims. vLLM defaults to max (2560 for the 4B), and custom dims error without MRL enabled. I’ll run vLLM with:

vllm serve Qwen/Qwen3-Embedding-4B \
  --hf-overrides '{"is_matryoshka": true, "matryoshka_dimensions": [32, 64, 128, 256, 512, 768, 1024, 2560]}'

If anyone still hits the MRL error, adding "is_matryoshka": true in the model’s config.json also works. I’ll update the docs/readme to reflect this—thanks for the pointer.
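As an illustration of that config-level fix, the same override could be placed directly in the model's config.json (field names as used above; the exact dimension list is just the example set from the command):

```json
{
  "is_matryoshka": true,
  "matryoshka_dimensions": [32, 64, 128, 256, 512, 768, 1024, 2560]
}
```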

  4. Reply to “what model?”

Pick one depending on audience.

Public-friendly

Current stack: Qwen3-Embedding-4B for embeddings, served via vLLM. Fine-tuning pipeline is in progress; larger models will slot in as they release.

Ultra-short

Qwen3-Embedding-4B + vLLM. Fine-tuning pipeline active.

  5. Reply to “Hey! I’m looking into fine-tuning this model. How to do it? Like Kohya?”

Direct

Kohya targets image models (Stable Diffusion). For LLMs/embeddings, use LoRA/QLoRA with Hugging Face TRL + PEFT. High-level flow:
1. Clean/label your training pairs (prompt→target or text corpora).
2. Train with TRL/PEFT (LoRA/QLoRA) on the base model.
3. Validate on held-out eval; iterate.
4. Export adapters; serve with vLLM (merge at load or offline).
If you want a starter, I can share a minimal TRL + PEFT script you can run on a single GPU.
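For intuition on what step 2 actually trains, here is a minimal, dependency-free sketch of the LoRA update rule (toy matrices in pure Python; a real run would use PEFT's `LoraConfig` with learned A/B factors, so treat the shapes and values here as illustrative only):

```python
def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_forward(W, A, B, alpha, r, x):
    """Apply (W + (alpha / r) * B @ A) to input vector x.

    W stays frozen; only the low-rank factors A (r x in_dim) and
    B (out_dim x r) would be trained.
    """
    scale = alpha / r
    delta = matmul(B, A)  # out_dim x in_dim low-rank update
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, delta)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_eff]

# With B initialized to zeros (the standard LoRA init), the adapter is a no-op:
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight
A = [[0.5, -0.5]]             # r=1, in_dim=2
B = [[0.0], [0.0]]            # out_dim=2, r=1, zero-init
print(lora_forward(W, A, B, alpha=2.0, r=1, x=[3.0, 4.0]))  # → [3.0, 4.0]
```

This is also why "export adapters; merge at load or offline" (step 4) works: merging just folds the scaled `B @ A` product into the base weights once, after which serving needs no extra machinery.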

Short

Not Kohya—use TRL + PEFT (LoRA/QLoRA). Prep data → train adapters → eval → serve via vLLM. I can drop a minimal script if you want.

If you want these packaged as canned replies for quick paste in your “happy face” inbox, say the word and I’ll format them as one-click snippets.
