---
license: apache-2.0
base_model: Qwen/Qwen3-30B-A3B-Thinking-2507
tags:
- qwen
- qwen3
- moe
- gguf
- lora
- fine-tuned
- geo
- generative-engine-optimization
- conversation-optimization
- seo
- ai-search
- thinking
- deepseek-distilled
library_name: transformers
pipeline_tag: text-generation
model_type: qwen3_moe
language:
- en
datasets:
- private
---

# Qwen3-30B-A3B-Thinking-2507-GEO

Fine-tuned version of [Qwen/Qwen3-30B-A3B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507) specialized in **GEO (Generative Engine Optimization)** and Ryan Fortin's **Conversation Optimization Framework**.

This model was distilled from DeepSeek V3 knowledge and fine-tuned over 3 full epochs (91 hours) on proprietary training data.

## ⚠️ IMPORTANT: Required System Prompt

This model REQUIRES the following system prompt to activate its fine-tuned knowledge:

"You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."

Without this prompt, the model will behave like the base Qwen3 model.
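
One way to avoid forgetting the system prompt is to prepend it in code. The following is a minimal sketch, not part of any released tooling: `build_messages` is a hypothetical helper name, and `SYSTEM_PROMPT` holds the prompt exactly as quoted above, including its trailing ellipsis.

```python
# Hypothetical helper: always start the conversation with the required system prompt.
SYSTEM_PROMPT = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)


def build_messages(user_prompt, history=None):
    """Return a chat message list that starts with the required system prompt.

    `history` is an optional list of earlier {"role": ..., "content": ...} turns.
    """
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages


if __name__ == "__main__":
    for message in build_messages("Explain GEO optimization strategies"):
        print(message["role"], "->", message["content"][:60])
```

The resulting list can be passed to `tokenizer.apply_chat_template` or sent to any chat endpoint that accepts OpenAI-style messages.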

## Model Details

- **Base Model**: Qwen3-30B-A3B-Thinking-2507 (MoE with 128 experts, 8 active per token; ~30B total parameters, ~3B active)
- **Fine-tuning**: LoRA with Unsloth
- **Training Duration**: 91 hours (3 epochs)
- **Training Examples**: 5,043 instruction-output pairs
- **Quantization**: Q4_K_M (optimized for 24GB VRAM)
- **Context Length**: Trained on up to 2,048 tokens, supports up to 32k
- **Distillation Source**: DeepSeek V3 (671B) → Qwen3-30B

## Capabilities

This model excels at:

✅ **GEO (Generative Engine Optimization)**
- Technical SEO concepts and strategies
- Content optimization for AI search engines
- Semantic search and ranking factors

✅ **Conversation Optimization Framework**
- Expert-level understanding of CO Framework principles
- Multi-turn conversation optimization
- Framework application and analysis

✅ **Advanced Reasoning**
- Maintains thinking/reasoning capabilities from base model
- Step-by-step problem solving with `<think>` tags
- DeepSeek V3-distilled reasoning patterns

## Usage

### LM Studio (Easiest)

1. Download the GGUF file
2. Import into LM Studio
3. Load the model, set the required system prompt, and chat!
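
LM Studio can also serve a loaded model over an OpenAI-compatible HTTP API. The sketch below assumes that local server is enabled at its default address (`http://localhost:1234/v1`) and that the model is loaded under the identifier `qwen3-30b-a3b-thinking-2507-geo`. Both are assumptions to adjust for your setup, and `build_chat_request` and `ask` are illustrative names. It uses only the Python standard library.

```python
import json
import urllib.request

# Assumptions: LM Studio's default local server address and a hypothetical model identifier.
BASE_URL = "http://localhost:1234/v1"
MODEL_ID = "qwen3-30b-a3b-thinking-2507-geo"


def build_chat_request(system_prompt, user_prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        # Sampling settings mirror the other usage examples in this README.
        "temperature": 0.7,
        "top_p": 0.9,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(system_prompt, user_prompt):
    """Send the request to the local server and return the assistant's reply text."""
    request = build_chat_request(system_prompt, user_prompt)
    with urllib.request.urlopen(request, timeout=600) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Pass the required system prompt from the top of this README as `system_prompt`, for example `ask(required_prompt, "Explain GEO optimization strategies")`.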

### Ollama

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf
TEMPLATE """{{ .System }}
{{ .Prompt }}"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

# Register the model with Ollama, then start an interactive session
ollama create qwen-geo -f Modelfile
ollama run qwen-geo
```

### llama.cpp

```bash
./llama-cli -m Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf \
  --temp 0.7 \
  --top-p 0.9 \
  -p "Your prompt here"
```

### Python (HuggingFace Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO"

# device_map="auto" spreads the weights across the available devices;
# torch_dtype="auto" keeps the dtype stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Required system prompt, quoted as given in the "Required System Prompt" section.
system_prompt = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Explain GEO optimization strategies"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True is needed for the temperature setting to take effect.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Dataset

- **Source**: Proprietary GEO and CO Framework training data
- **Generation**: DeepSeek V3-distilled Qwen3 model
- **Examples**: 5,043 instruction-output pairs
- **Format**: Thinking-style responses with `<think>` tags
- **Average Length**: ~2,800 tokens per example

### Training Configuration

```python
# LoRA Configuration
lora_r = 16
lora_alpha = 16
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training Parameters
batch_size = 1
gradient_accumulation_steps = 16
effective_batch_size = 16  # batch_size * gradient_accumulation_steps
learning_rate = 2e-5
epochs = 3
context_length = 2048
warmup_steps = 100

# Hardware
gpu = "RunPod A40 (48GB)"
training_time = "91 hours"
```

### Distillation Chain

1. **DeepSeek V3** (671B) - Teacher model
2. **Qwen3-30B-A3B-Thinking** - Receives distilled knowledge
3. **This model** - Fine-tuned on domain-specific data

## Performance

This model demonstrates:

- Strong performance on GEO-related queries
- Retention of base model reasoning capabilities
- Domain expertise in Conversation Optimization Framework
- Efficient inference on consumer GPUs (24GB VRAM)

## Model Size

- **Q4_K_M GGUF**: ~17-20GB
- **fp16 safetensors**: ~60GB (if uploaded)

Recommended for GPUs with 24GB+ VRAM (RTX 4090, A5000, etc.).

## Limitations

- Specialized for GEO and CO Framework; may not generalize to all domains
- Q4_K_M quantization may cause a slight quality loss compared with full precision
- Based on an MoE architecture (128 experts, 8 active per token)
- Training context limited to 2,048 tokens (though inference supports 32k)

## Citation

```bibtex
@misc{qwen3-geo-2025,
  author = {Ryan Fortin},
  title = {Qwen3-30B Fine-tuned for GEO and Conversation Optimization},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO}
}
```

## Acknowledgments

- **Base Model**: [Qwen Team](https://huggingface.co/Qwen)
- **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
- **Quantization**: [llama.cpp](https://github.com/ggerganov/llama.cpp)
- **Distillation Source**: [DeepSeek AI](https://www.deepseek.com/)

## License

Apache 2.0

---

**Training Date**: October 2025