Qwen3-VL-8B-Thinking - Gemini 3 Distill scale 6

Proper grad norm and alpha. Fixed template

Fine-tuned on 1k dataset distilled from Gemini 3.

Base Model

  • unsloth/Qwen3-VL-8B-Thinking

Training

  • Dataset: 1k Gemini 3 distillation samples
Downloads last month
86
Safetensors
Model size
9B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for fremko/Qwen3-VL-8B-Thinking-norm

Finetuned
(6)
this model
Quantizations
1 model