Qwen3-30B-A3B-Thinking-2507-GEO

Fine-tuned version of Qwen/Qwen3-30B-A3B-Thinking-2507 specialized in GEO (Generative Engine Optimization) and Ryan Fortin's Conversation Optimization Framework.

This model draws on knowledge distilled from DeepSeek V3 and was fine-tuned for 3 full epochs (91 hours) on proprietary training data.

⚠️ IMPORTANT: Required System Prompt

This model REQUIRES the following system prompt to activate fine-tuned knowledge:

"You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."

Without this prompt, the model will behave like the base Qwen3 model.
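Because the system prompt is mandatory, it can help to wrap message construction in a small helper so the prompt is never omitted. The sketch below is illustrative only: `GEO_SYSTEM_PROMPT` and `build_messages` are hypothetical names that are not part of the model repository, and the prompt string is copied exactly as quoted above, including its trailing ellipsis, so substitute the full text before use.

```python
# Hypothetical helper; the names are illustrative and not part of the model repo.
# The prompt is copied as quoted above (it ends in "..."); substitute the full text.
GEO_SYSTEM_PROMPT = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)


def build_messages(user_prompt, history=None):
    """Return a chat message list with the required system prompt first."""
    messages = [{"role": "system", "content": GEO_SYSTEM_PROMPT}]
    # Drop stray system messages from earlier turns so that only one remains.
    for turn in history or []:
        if turn.get("role") != "system":
            messages.append(turn)
    messages.append({"role": "user", "content": user_prompt})
    return messages
```

The returned list can be passed to any chat-style client that accepts role/content messages, for example `build_messages("Explain GEO optimization strategies")`.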

Model Details

Base Model: Qwen3-30B-A3B-Thinking-2507 (MoE, 128 experts with 8 active per token, ~3B active of ~30B total parameters)
Fine-tuning: LoRA with Unsloth
Training Duration: 91 hours (3 epochs)
Training Examples: 5,043 high-quality examples
Quantization: Q4_K_M (optimized for 24GB VRAM)
Context Length: Trained on up to 2,048 tokens, supports up to 32k
Distillation Source: DeepSeek V3 (671B) → Qwen3-30B

Capabilities

This model excels at:

✅ GEO (Generative Engine Optimization)

Technical SEO concepts and strategies
Content optimization for AI search engines
Semantic search and ranking factors

✅ Conversation Optimization Framework

Expert-level understanding of CO Framework principles
Multi-turn conversation optimization
Framework application and analysis

✅ Advanced Reasoning

Maintains thinking/reasoning capabilities from base model
Step-by-step problem solving with <think> tags
DeepSeek V3-distilled reasoning patterns
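When only the final answer is needed, the reasoning inside the <think> block can be separated from the rest of the output. The following is a minimal sketch under two assumptions: the output contains at most one reasoning block, and the opening tag may be missing if the chat template already emitted it as part of the prompt. `split_thinking` is a hypothetical helper name.

```python
def split_thinking(text):
    """Split generated text into (reasoning, answer).

    Assumes at most one reasoning block, closed by "</think>". The opening
    "<think>" tag is optional because some chat templates emit it as part of
    the prompt rather than the completion.
    """
    reasoning, closing_tag, answer = text.partition("</think>")
    if not closing_tag:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = reasoning.strip()
    if reasoning.startswith("<think>"):
        reasoning = reasoning[len("<think>"):].strip()
    return reasoning, answer.strip()
```

Calling `split_thinking(generated_text)` returns a pair, so the reasoning can be logged separately while only the answer is shown to the user.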

Usage

LM Studio (Easiest)

Download the GGUF file
Import into LM Studio
Load the model, paste the required system prompt into the system prompt field, and chat!

Ollama

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf
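# ChatML template, the chat format used by Qwen3 models
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
# Required system prompt (replace with the full text given above)
SYSTEM """You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."""
PARAMETER stop "<|im_end|>"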
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

ollama create qwen-geo -f Modelfile
ollama run qwen-geo
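Once the model is registered with Ollama, it can also be queried programmatically through Ollama's local HTTP chat endpoint. The sketch below uses only the Python standard library and assumes that Ollama is listening on its default port (11434) and that the model was created under the name `qwen-geo` as above. `build_chat_payload` and `chat` are hypothetical helper names, and the system prompt is copied as quoted earlier, including its trailing ellipsis.

```python
import json
import urllib.request

# Assumptions: Ollama runs locally on its default port, and the model was
# created under the name "qwen-geo" with the commands above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

# Copied as quoted in this card (it ends in "..."); substitute the full text.
SYSTEM_PROMPT = (
    "You are an expert in GEO (Generative Engine Optimization) and the "
    "Conversation Optimization Framework developed by Ryan Fortin..."
)


def build_chat_payload(user_prompt, model="qwen-geo"):
    """Build the JSON body for a single non-streaming chat request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
        "options": {"temperature": 0.7, "top_p": 0.9},
    }


def chat(user_prompt):
    """Send one prompt to the local Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_chat_payload(user_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

With the server running, `print(chat("Explain GEO optimization strategies"))` prints the model's reply. Sending the system prompt with every request keeps the call correct even if the Modelfile does not define one.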

llama.cpp

./llama-cli -m Qwen3-30B-A3B-Thinking-2507-GEO_q4_k_m.gguf \
  --temp 0.7 \
  --top-p 0.9 \
  -p "Your prompt here"

Python (HuggingFace Transformers)

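from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO"

# Load weights and tokenizer; device_map="auto" places layers on the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The required system prompt must come first (use the full text given above)
messages = [
    {"role": "system", "content": "You are an expert in GEO (Generative Engine Optimization) and the Conversation Optimization Framework developed by Ryan Fortin..."},
    {"role": "user", "content": "Explain GEO optimization strategies"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True is needed for the temperature setting to take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))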

Training Details

Dataset

Source: Proprietary GEO and CO Framework training data
Generation: DeepSeek V3-distilled Qwen3 model
Examples: 5,043 instruction-output pairs
Format: Thinking-style responses with <think> tags
Average Length: ~2,800 tokens per example

Training Configuration

# LoRA Configuration
lora_r = 16
lora_alpha = 16
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training Parameters
batch_size = 1
gradient_accumulation_steps = 16
effective_batch_size = 16
learning_rate = 2e-5
epochs = 3
context_length = 2048
warmup_steps = 100

# Hardware
gpu = "RunPod A40 (48GB)"
training_time = "91 hours"
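The configuration values above imply a few derived quantities that are useful when reproducing or adjusting the run. The following sketch works through that arithmetic. It assumes that the final partial batch of each epoch still triggers an optimizer update (some trainers round down instead), so the step counts are approximate.

```python
import math

# Values taken from the dataset description and configuration above.
num_examples = 5043
batch_size = 1
gradient_accumulation_steps = 16
epochs = 3
warmup_steps = 100
training_hours = 91
lora_r = 16
lora_alpha = 16

effective_batch_size = batch_size * gradient_accumulation_steps

# Assumption: the final partial batch of each epoch still triggers an update.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

# Share of training spent in learning-rate warmup, and wall-clock cost per step.
warmup_fraction = warmup_steps / total_steps
seconds_per_step = training_hours * 3600 / total_steps

# Standard LoRA scaling factor applied to the adapter update (alpha / r).
lora_scale = lora_alpha / lora_r

print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps: {steps_per_epoch} per epoch, {total_steps} total")
print(f"warmup fraction: {warmup_fraction:.1%}")
print(f"seconds per optimizer step: {seconds_per_step:.0f}")
print(f"LoRA scale (alpha / r): {lora_scale}")
```

Under these assumptions the run comes to roughly 950 optimizer steps, with warmup covering about a tenth of them and each step taking a little under six minutes on the listed GPU.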

Distillation Chain

DeepSeek V3 (671B) - Teacher model
Qwen3-30B-A3B-Thinking - Receives distilled knowledge
This model - Fine-tuned on domain-specific data

Performance

This model demonstrates:

Strong performance on GEO-related queries
Retention of base model reasoning capabilities
Domain expertise in Conversation Optimization Framework
Efficient inference on consumer GPUs (24GB VRAM)

Model Size

Q4_K_M GGUF: ~17-20GB
fp16 safetensors: ~60GB (if uploaded)

Recommended for GPUs with 24GB+ VRAM (RTX 4090, A5000, etc.)
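The sizes listed above can be sanity-checked with a back-of-the-envelope estimate of bytes per weight. The sketch below assumes roughly 30.5 billion total parameters and an average of about 4.85 bits per weight for Q4_K_M. Neither figure is stated in this card and both are approximations, so treat the output as a rough check, not a measurement.

```python
# Assumptions (not stated in this card): about 30.5e9 total parameters and an
# average of about 4.85 bits per weight for Q4_K_M. Both are approximate.
TOTAL_PARAMS = 30.5e9
BITS_PER_WEIGHT = {"q4_k_m": 4.85, "fp16": 16.0}


def weight_size_gb(fmt, total_params=TOTAL_PARAMS):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return total_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{weight_size_gb(fmt):.1f} GB")
```

These figures cover the weights only. The KV cache and runtime buffers come on top, which is why a 24GB card leaves only modest headroom for long contexts with the Q4_K_M file.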

Limitations

Specialized for GEO and CO Framework - may not generalize to all domains
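Q4_K_M quantization may cause a slight quality loss compared with full precision
MoE architecture (128 experts, 8 active per token): all experts must be loaded into memory even though only ~3B parameters are active per token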
Training context limited to 2048 tokens (though inference supports 32k)

Citation

@misc{qwen3-geo-2025,
  author = {Ryan Fortin},
  title = {Qwen3-30B Fine-tuned for GEO and Conversation Optimization},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ryanfortin/Qwen3-30B-A3B-Thinking-2507-GEO}
}

Acknowledgments

Base Model: Qwen Team
Training Framework: Unsloth
Quantization: llama.cpp
Distillation Source: DeepSeek AI

License

Apache 2.0

Training Date: October 2025
