Qwen3-30B-A3B-Thinking-2507-GEO
Fine-tuned version of Qwen/Qwen3-30B-A3B-Thinking-2507 specialized in GEO (Generative Engine Optimization) and Ryan Fortin's Conversation Optimization Framework.
This model was distilled from DeepSeek V3 knowledge and fine-tuned over 3 full epochs (91 hours) on proprietary training data.
⚠️ IMPORTANT: Required System Prompt
This model REQUIRES the following system prompt to activate fine-tuned knowledge:
"You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."
Without this prompt, the model will behave like the base Qwen3 model.
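Because the prompt is mandatory, it can help to prepend it in code so that no request goes out without it. The following is a minimal sketch under stated assumptions: `REQUIRED_SYSTEM_PROMPT` and `with_system_prompt` are hypothetical names, not part of any library, and the prompt string is truncated with "..." exactly as quoted above, so substitute the full text.

```python
# Hypothetical helper: the names here are illustrative, not part of the model's API.
# The prompt is truncated ("...") exactly as quoted above; replace it with the full text.
REQUIRED_SYSTEM_PROMPT = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)


def with_system_prompt(messages):
    """Return a copy of `messages` that starts with a system message."""
    if messages and messages[0].get("role") == "system":
        return list(messages)  # caller already supplied a system message
    return [{"role": "system", "content": REQUIRED_SYSTEM_PROMPT}, *messages]


chat = with_system_prompt(
    [{"role": "user", "content": "Explain GEO optimization strategies"}]
)
print(chat[0]["role"])
```

The helper leaves an existing system message untouched, so it is safe to call more than once on the same conversation.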
Model Details
- Base Model: Qwen3-30B-A3B-Thinking-2507 (MoE - 128 experts × 1.8B, ~3B active)
- Fine-tuning: LoRA with Unsloth
- Training Duration: 91 hours (3 epochs)
- Training Examples: 5,043 high-quality examples
- Quantization: Q4_K_M (optimized for 24GB VRAM)
- Context Length: Trained on up to 2,048 tokens, supports up to 32k
- Distillation Source: DeepSeek V3 (671B) → Qwen3-30B
Capabilities
This model excels at:
GEO (Generative Engine Optimization)
- Technical SEO concepts and strategies
- Content optimization for AI search engines
- Semantic search and ranking factors
Conversation Optimization Framework
- Expert-level understanding of CO Framework principles
- Multi-turn conversation optimization
- Framework application and analysis
Advanced Reasoning
- Maintains thinking/reasoning capabilities from base model
- Step-by-step problem solving with <think> tags
- DeepSeek V3-distilled reasoning patterns
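Since responses carry their reasoning inside `<think>` tags, downstream code usually needs to separate the reasoning from the final answer. A minimal sketch, assuming at most one reasoning block per response; `split_thinking` is a hypothetical helper name. Whether the opening tag appears in the generated text depends on the chat template, so the sketch keys on the closing tag only.

```python
def split_thinking(text):
    """Split model output into (reasoning, answer) around the closing </think> tag."""
    closing = "</think>"
    if closing not in text:
        return "", text.strip()  # no reasoning block found
    reasoning, answer = text.split(closing, 1)
    # Some chat templates open the <think> block in the prompt itself, so the
    # generated text may contain only the closing tag; drop the opener if present.
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


reasoning, answer = split_thinking("<think>Compare both options.</think>Use option B.")
print(answer)
```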
Usage
LM Studio (Easiest)
- Download the GGUF file
- Import into LM Studio
- Load and chat!
Ollama
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf
TEMPLATE """{{ .System }}
{{ .Prompt }}"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF
ollama create qwen-geo -f Modelfile
ollama run qwen-geo
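Once the `qwen-geo` model exists, the required system prompt can be sent with each request through Ollama's local HTTP chat endpoint. This is a sketch, not a definitive client: it assumes Ollama is listening on its default address, `build_payload` and `chat` are hypothetical helper names, and the prompt is truncated with "..." exactly as in the Required System Prompt section.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # assumed default local address
# Truncated ("...") exactly as in the Required System Prompt section; use the full text.
SYSTEM_PROMPT = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)


def build_payload(user_text, model="qwen-geo"):
    """Build a non-streaming chat request that always carries the system prompt."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
        # Same sampling settings as the Modelfile above.
        "options": {"temperature": 0.7, "top_p": 0.9},
        "stream": False,
    }


def chat(user_text):
    """Send the request to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


payload = build_payload("Explain GEO optimization strategies")
print(json.dumps(payload, indent=2))
```

With the server running, `chat("Explain GEO optimization strategies")` returns the model's reply as a string.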
llama.cpp
./llama-cli -m Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf \
--temp 0.7 \
--top-p 0.9 \
-p "Your prompt here"
Python (HuggingFace Transformers)
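from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Required system prompt (truncated here exactly as in the section above; use the full text)
system_prompt = "You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Explain GEO optimization strategies"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
# do_sample=True makes sure the temperature setting is actually applied
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))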
Training Details
Dataset
- Source: Proprietary GEO and CO Framework training data
- Generation: DeepSeek V3-distilled Qwen3 model
- Examples: 5,043 instruction-output pairs
- Format: Thinking-style responses with <think> tags
- Average Length: ~2,800 tokens per example
Training Configuration
# LoRA Configuration
lora_r = 16
lora_alpha = 16
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]
# Training Parameters
batch_size = 1
gradient_accumulation_steps = 16
effective_batch_size = 16
learning_rate = 2e-5
epochs = 3
context_length = 2048
warmup_steps = 100
# Hardware
gpu = "RunPod A40 (48GB)"
training_time = "91 hours"
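The numbers in this configuration can be cross-checked with a little arithmetic. The sketch below derives the optimizer step count, the warmup share, and the LoRA scaling factor from the values above. It assumes a single GPU, no sequence packing, and that the final partial batch of each epoch is kept; a trainer that drops it would report slightly fewer steps.

```python
import math

# Values copied from the configuration above.
num_examples = 5043
batch_size = 1
gradient_accumulation_steps = 16
epochs = 3
warmup_steps = 100
lora_r, lora_alpha = 16, 16
training_hours = 91

effective_batch_size = batch_size * gradient_accumulation_steps
# Assumption: the final partial batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs
warmup_fraction = warmup_steps / total_steps
# Standard LoRA scales the adapter update by alpha / r.
lora_scale = lora_alpha / lora_r
seconds_per_step = training_hours * 3600 / total_steps

print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps: {steps_per_epoch} per epoch, {total_steps} total")
print(f"warmup share of training: {warmup_fraction:.1%}")
print(f"LoRA scaling (alpha / r): {lora_scale}")
print(f"average time per optimizer step: {seconds_per_step:.0f} s")
```

Under these assumptions, warmup covers roughly the first tenth of training, and `lora_alpha = lora_r` means the adapter update is applied at a scale of 1.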
Distillation Chain
- DeepSeek V3 (671B) - Teacher model
- Qwen3-30B-A3B-Thinking - Receives distilled knowledge
- This model - Fine-tuned on domain-specific data
Performance
This model demonstrates:
- Strong performance on GEO-related queries
- Retention of base model reasoning capabilities
- Domain expertise in Conversation Optimization Framework
- Efficient inference on consumer GPUs (24GB VRAM)
Model Size
- Q4_K_M GGUF: ~17-20GB
- fp16 safetensors: ~60GB (if uploaded)
Recommended for GPUs with 24GB+ VRAM (RTX 4090, A5000, etc.)
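The sizes above can be sanity-checked from the parameter count and the bits stored per weight. This is a rough sketch with two assumptions that do not come from this card: about 30.5 billion total parameters for the base model, and an average of about 4.85 bits per weight for Q4_K_M. It covers the weights only; KV cache and runtime overhead come on top.

```python
# Assumptions (not stated in this card, both approximate):
#   ~30.5e9 total parameters for the base model
#   ~4.85 bits per weight on average for Q4_K_M
TOTAL_PARAMS = 30.5e9
BITS_PER_WEIGHT = {"fp16": 16.0, "Q4_K_M": 4.85}


def weight_size_gb(params, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_size_gb(TOTAL_PARAMS, bits):.1f} GB")
```

Both estimates land in the ranges listed above, which leaves a few GB of headroom on a 24GB card for the quantized model.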
Limitations
- Specialized for GEO and CO Framework - may not generalize to all domains
- Q4_K_M quantization may have slight quality loss vs full precision
- Based on MoE architecture (128×1.8B experts)
- Training context limited to 2048 tokens (though inference supports 32k)
Citation
@misc{qwen3-geo-2025,
author = {Ryan Fortin},
title = {Qwen3-30B Fine-tuned for GEO and Conversation Optimization},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO}
}
Acknowledgments
- Base Model: Qwen Team
- Training Framework: Unsloth
- Quantization: llama.cpp
- Distillation Source: DeepSeek AI
License
Apache 2.0
Training Date: October 2025