Can't get Qwen Image to work
So I've been using nunchaku models for quite some time: flux dev, flux krea dev, flux fill - all worked flawlessly, thank you so much for them, btw.
I also used qwen-image-Q4_K_S.gguf (11.3 GB) and it worked fine: it took 3-4 minutes per image, but it was perfectly stable and worked without issues.
But neither svdq-fp4_r32-qwen-image.safetensors nor svdq-fp4_r128-qwen-image.safetensors would work. The latter just failed to load. The former was much worse: it showered me with a plethora of red errors and made my PC lag horribly for several minutes, during which I was unable to do pretty much anything. So far those are the only models I've tried that behaved this badly.
I have 32 GB of RAM, a 64 GB pagefile, and an RTX 5070 (12 GB VRAM).
The errors it reported:
10:04:00.488 [Info] User local requested 3 images with model 'svdq-fp4_r32-qwen-image.safetensors'...
10:07:30.586 [Warning] [ComfyUI-0/STDERR] !!! Exception during processing !!! CUDA error: out of memory
10:07:30.835 [Warning] [ComfyUI-0/STDERR] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
10:07:33.482 [Warning] [ComfyUI-0/STDERR] Traceback (most recent call last):
10:07:33.484 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 496, in execute
10:07:33.486 [Warning] [ComfyUI-0/STDERR] output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
10:07:33.487 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10:07:33.489 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 315, in get_output_data
10:07:33.490 [Warning] [ComfyUI-0/STDERR] return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
10:07:33.492 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10:07:33.494 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 289, in _async_map_node_over_list
10:07:33.495 [Warning] [ComfyUI-0/STDERR] await process_inputs(input_dict, i)
10:07:33.502 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 277, in process_inputs
10:07:33.503 [Warning] [ComfyUI-0/STDERR] result = f(**inputs)
10:07:33.505 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^
10:07:33.506 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\DLNodes\ComfyUI-nunchaku\nodes\models\qwenimage.py", line 183, in load_model
10:07:33.507 [Warning] [ComfyUI-0/STDERR] model.model.diffusion_model.set_offload(cpu_offload_enabled)
10:07:33.509 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\DLNodes\ComfyUI-nunchaku\models\qwenimage.py", line 794, in set_offload
10:07:33.511 [Warning] [ComfyUI-0/STDERR] self.offload_manager = CPUOffloadManager(
10:07:33.512 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^
10:07:33.516 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\nunchaku\models\utils.py", line 118, in __init__
10:07:33.518 [Warning] [ComfyUI-0/STDERR] self.set_device(device)
10:07:33.520 [Warning] [ComfyUI-0/STDERR] File "H:\SwarmUI\SwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\nunchaku\models\utils.py", line 159, in set_device
10:07:33.521 [Warning] [ComfyUI-0/STDERR] p.data = p.data.pin_memory()
10:07:33.523 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^
10:07:33.525 [Warning] [ComfyUI-0/STDERR] RuntimeError: CUDA error: out of memory
10:07:33.526 [Warning] [ComfyUI-0/STDERR] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
10:08:02.795 [Error] [BackendHandler] backend #0 failed to load model with error: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:02.796 [Warning] [BackendHandler] backend #0 failed to load model svdq-fp4_r32-qwen-image.safetensors
10:08:03.714 [Warning] [BackendHandler] All backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'! Cannot generate anything.
10:08:03.718 [Error] [BackendHandler] Backend request #1 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:03.723 [Error] [BackendHandler] Backend request #1 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:03.769 [Warning] [BackendHandler] All backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'! Cannot generate anything.
10:08:03.772 [Error] [BackendHandler] Backend request #2 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:03.779 [Error] [BackendHandler] Backend request #2 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:03.789 [Warning] [BackendHandler] All backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'! Cannot generate anything.
10:08:03.867 [Error] [BackendHandler] Backend request #3 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
10:08:03.870 [Error] [BackendHandler] Backend request #3 failed: All available backends failed to load the model 'H:/SwarmUI/SwarmUI/Models/diffusion_models/svdq-fp4_r32-qwen-image.safetensors'.
Possible reason: ComfyUI execution error: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
The previous times I tried to run it, it gave a bunch of different errors, but I didn't save those. "Out of memory" was only mentioned the last time; there were no mentions of memory before.
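To cut through a log like the one above, here is a minimal sketch that strips the SwarmUI prefixes and reports the innermost traceback frame plus the exception line. The function name `summarize` and the regular expressions are illustrative assumptions based only on the log format shown in this post, not part of SwarmUI or nunchaku. Run against the log above it should point at `set_device` in nunchaku's `utils.py`, i.e. the `pin_memory()` call. Pinned (page-locked) host memory is allocated through CUDA, so a failure to lock enough system RAM is likely to surface as "CUDA error: out of memory" even when VRAM is not the limiting factor.

```python
import re

# Prefix such as: 10:07:33.520 [Warning] [ComfyUI-0/STDERR]
PREFIX = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} \[\w+\] \[[^\]]+\] ?")
# Python traceback frame line: File "path", line N, in func
FRAME = re.compile(r'File "(?P<path>.+)", line (?P<line>\d+), in (?P<func>\S+)')
# Exception line such as: RuntimeError: CUDA error: out of memory
ERROR = re.compile(r"^(?P<type>\w+(?:Error|Exception)): (?P<msg>.*)$")


def summarize(log_text):
    """Return the innermost traceback frame and the first exception line found."""
    frames = []
    error = None
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        frame = FRAME.search(line)
        if frame:
            frames.append((frame["path"], int(frame["line"]), frame["func"]))
            continue
        exc = ERROR.match(line)
        if exc and error is None:
            error = (exc["type"], exc["msg"])
    return {"innermost": frames[-1] if frames else None, "error": error}
```

Usage: paste the log into a text file, read it with `open(path, encoding="utf-8").read()`, and pass the string to `summarize`.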
What are your versions of nunchaku and ComfyUI-nunchaku? Both need to be the latest.
Also, some users had to reinstall ComfyUI. You can find discussion about it in the GitHub issues, for example here:
https://github.com/nunchaku-tech/ComfyUI-nunchaku/issues/527
Ehhh... I'm afraid I don't know what those versions are. I'm a grandma-tier technology user for all intents and purposes. :(
They're not mentioned in the starting cmd "log" and I can't find them anywhere on swarmui tabs.
Also thank you for the link but I also have no clue where to paste that use_pin_memory solution that is provided there. Googling didn't help.
UPD. OK, found it:
SwarmUI\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\DLNodes\ComfyUI-nunchaku\nodes\models\qwenimage.py
line 183
Tried it.
- use_pin_memory=False, num_blocks_on_gpus=40 (like one user suggested on the SwarmUI Discord):
  - the r128 version was doing something (or nothing) for a few minutes until I interrupted it
  - the r32 version made my PC lag horribly for several minutes until I stopped it
- just use_pin_memory=False:
  - the r128 version made my PC lag for several minutes until it finally interrupted itself with an error
  - the r32 version worked, but everything lags, and the results (albeit very fast) seem worse overall than what I get from qwen-image-Q4_K_S.gguf (which feels much more stable)
Also, in the last case my GPU is barely used (VRAM at 21%); it seems like it shoved everything into RAM for some reason (which, unlike VRAM, is filled to the brim).
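To put a number on the VRAM observation while a generation is running, here is a minimal sketch that asks `nvidia-smi` for used and total memory and prints the percentage. It assumes `nvidia-smi` is on the PATH (it normally comes with the NVIDIA driver); the function names are illustrative.

```python
import subprocess

# nvidia-smi query returning "used, total" in MiB, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_line):
    """Turn a line like '2580, 12288' into (used_mib, total_mib, percent_used)."""
    used, total = (int(field.strip()) for field in csv_line.split(","))
    return used, total, 100 * used / total


def read_vram():
    """Run nvidia-smi and return one (used, total, percent) tuple per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_vram(line) for line in out.splitlines() if line.strip()]


# Example use while an image is generating:
#   for used, total, pct in read_vram():
#       print(f"{used} / {total} MiB ({pct:.0f}%)")
```

Comparing this figure with system RAM usage in Task Manager over the same period would show whether the model weights are mostly sitting in RAM rather than on the GPU.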