
Qwen2.5-7B-Instruct GGUF Models

A collection of quantized Qwen2.5-7B-Instruct models in GGUF format, optimized for different hardware configurations and use cases.

🎯 Quick Start

Download Models

# Download all models
git lfs install
git clone https://huggingface.co/wanhin/qwen2.5-7b-instruct-gguf

# Or download specific models
wget https://huggingface.co/wanhin/qwen2.5-7b-instruct-gguf/resolve/main/qwen2.5-7b-instruct-q6_k.gguf
wget https://huggingface.co/wanhin/qwen2.5-7b-instruct-gguf/resolve/main/qwen2.5-7b-instruct-q4_k_m.gguf

Run Inference

# With llama.cpp (recent builds name the binary llama-cli; older builds use ./main)
./llama-cli -m qwen2.5-7b-instruct-q6_k.gguf -p "Hello!" -n 512 --repeat-penalty 1.1

# With Python (requires llama-cpp-python: pip install llama-cpp-python)
python -c "
from llama_cpp import Llama
llm = Llama(model_path='./qwen2.5-7b-instruct-q6_k.gguf')
print(llm('Hello!', max_tokens=100)['choices'][0]['text'])
"

πŸ“¦ Available Models

Model                              Size     Quality    Use Case
qwen2.5-7b-instruct.gguf           13.5 GB  Original   Best quality
qwen2.5-7b-instruct-q8_0.gguf      7.5 GB   Very High  High quality
qwen2.5-7b-instruct-q4_k_m.gguf    4.4 GB   Medium     Fast inference

🎨 CAD Design Specialization

These models are fine-tuned for CAD design tasks and can convert natural language descriptions into structured JSON for 3D modeling operations.
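As an illustration of this workflow, the sketch below asks the model for a JSON description of a simple part and parses the reply. The schema named in the prompt ('operation', 'shape', 'dimensions_mm') is purely hypothetical, not the model's documented output format:

# Hypothetical example: natural language -> CAD JSON
# The requested schema is illustrative only.
import json
from llama_cpp import Llama

llm = Llama(model_path="./qwen2.5-7b-instruct-q6_k.gguf", n_ctx=4096)
prompt = (
    "Convert this description into JSON with keys "
    "'operation', 'shape', and 'dimensions_mm': "
    "a cylinder 40 mm tall with a 10 mm radius."
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=256,
)
reply = out["choices"][0]["message"]["content"]
try:
    print(json.loads(reply))  # structured result, if the model complied
except json.JSONDecodeError:
    print("Model did not return valid JSON:\n", reply)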

πŸ“ License

Released under the MIT License.
