geoffmunn committed on
Commit c510220 · verified · 1 Parent(s): 34fe1f3

Add Q2–Q8_0 quantized models with per-model cards, MODELFILE, CLI examples, and auto-upload
.gitattributes CHANGED
@@ -33,3 +33,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
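The attribute rules above can be spot-checked with `git check-attr`, which reports which filter a given path resolves to. A minimal sketch using a throwaway repo (in a real clone, the final `git check-attr` line alone suffices):

```shell
# Confirm that a .gguf file is routed through Git LFS by the rules above.
repo=$(mktemp -d)
cd "$repo" && git init -q .
printf 'Qwen3Guard-Gen-4B-f16:Q2_K.gguf filter=lfs diff=lfs merge=lfs -text\n' > .gitattributes
git check-attr filter -- 'Qwen3Guard-Gen-4B-f16:Q2_K.gguf'   # prints "...: filter: lfs"
```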
MODELFILE ADDED
@@ -0,0 +1,25 @@
+ # MODELFILE for Qwen3Guard-Gen-4B
+ # Used by LM Studio, OpenWebUI, GPT4All, etc.
+
+ context_length: 32768
+ embedding: false
+ f16: cpu
+
+ # Chat template using ChatML (used by Qwen)
+ prompt_template: >-
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+
+ # Stop sequences help end generation cleanly
+ stop: "<|im_end|>"
+ stop: "<|im_start|>"
+
+ # Default sampling (optimized for safe generation)
+ temperature: 0.7
+ top_p: 0.9
+ top_k: 20
+ min_p: 0.05
+ repeat_penalty: 1.1
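As a quick sanity check, the `prompt_template` above can be rendered with a short script. The `format_chatml` helper below is illustrative, not part of any tool:

```python
# Render the ChatML template from the MODELFILE for a given user prompt.
# format_chatml is a hypothetical helper, shown only to illustrate the layout.
def format_chatml(prompt: str,
                  system: str = "You are a helpful assistant who always refuses harmful requests.") -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

if __name__ == "__main__":
    print(format_chatml("Explain why the sky is blue."))
```

The rendered string ends after the `<|im_start|>assistant` header, which is exactly where the model begins generating; the `stop` sequences cut it off at the closing `<|im_end|>`.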
Qwen3Guard-Gen-4B-Q2_K/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q2_K
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 1.7G
+ - **RAM Required**: ~2.0 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low
+ - **Recommendation**: Only for very weak devices; poor reasoning. Avoid.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q2_K.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q3_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.1G
+ - **RAM Required**: ~2.5 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low-Med
+ - **Recommendation**: Basic generation; acceptable for simple tasks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q3_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.0G
+ - **RAM Required**: ~2.3 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low
+ - **Recommendation**: Minimal quality; may miss subtle risks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q4_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.6G
+ - **RAM Required**: ~3.0 GB
+ - **Speed**: 🚀 Fast
+ - **Quality**: Balanced
+ - **Recommendation**: ✅ Best speed/quality balance.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q4_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.5G
+ - **RAM Required**: ~2.7 GB
+ - **Speed**: 🚀 Fast
+ - **Quality**: Medium
+ - **Recommendation**: Good for edge devices; decent output.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q5_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 3.0G
+ - **RAM Required**: ~3.3 GB
+ - **Speed**: 🐢 Medium
+ - **Quality**: High+
+ - **Recommendation**: ✅✅ Top choice for production safety apps.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q5_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.9G
+ - **RAM Required**: ~3.1 GB
+ - **Speed**: 🐢 Medium
+ - **Quality**: High
+ - **Recommendation**: High-quality responses; slightly faster than Q5_K_M.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q6_K/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q6_K
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 3.4G
+ - **RAM Required**: ~3.8 GB
+ - **Speed**: 🐌 Slow
+ - **Quality**: Near-FP16
+ - **Recommendation**: Excellent fidelity; ideal for precision tasks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q6_K.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q8_0/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q8_0
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 4.4G
+ - **RAM Required**: ~5.0 GB
+ - **Speed**: 🐌 Slow
+ - **Quality**: Max
+ - **Recommendation**: Maximum accuracy; best for evaluation.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q8_0.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-f16:Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8
+ size 1797125792
Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31
+ size 2242747552
Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56
+ size 2054126752
Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac
+ size 2716068512
Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4
+ size 2602097312
Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c
+ size 3156920992
Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d
+ size 3091118752
Qwen3Guard-Gen-4B-f16:Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6
+ size 3625326752
Qwen3Guard-Gen-4B-f16:Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a
+ size 4693671072
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - qwen
+ - safety
+ - guardrail
+ - text-generation
+ - llama.cpp
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ pipeline_tag: text-generation
+ ---
+
+ # Qwen3Guard-Gen-4B-GGUF
+
+ This is a **GGUF-quantized version** of **[Qwen3Guard-Gen-4B](https://huggingface.co/Qwen/Qwen3Guard-Gen-4B)**, a **safety-aligned generative model** from Alibaba's Qwen team.
+
+ Unlike standard LLMs, this model is **fine-tuned to refuse harmful requests by design**, making it ideal for applications where content safety is critical.
+
+ > ⚠️ This is a **generative model with built-in safety constraints**, not a classifier like `Qwen3Guard-Stream-4B`.
+
+ ## 🛡 What Is Qwen3Guard-Gen-4B?
+
+ It's a **helpful yet harmless assistant** trained to:
+ - Respond helpfully to safe queries
+ - Politely decline unsafe ones (e.g., illegal acts, self-harm)
+ - Avoid generating toxic, violent, or deceptive content
+ - Maintain factual consistency while being cautious
+
+ Perfect for:
+ - Educational chatbots
+ - Customer service agents
+ - Mental health support tools
+ - Moderated community bots
+
+ ## 🔗 Relationship to Other Safety Models
+
+ This model complements other Qwen3 safety tools:
+
+ | Model | Role | Best For |
+ |-------|------|----------|
+ | **Qwen3Guard-Stream-4B** | ⚡ Input filter | Real-time moderation of user input |
+ | **Qwen3Guard-Gen-4B** | 🧠 Safe generator | Generating non-toxic responses |
+ | **Qwen3-4B-SafeRL** | 🛡️ Fully aligned agent | Multi-turn ethical conversations via RLHF |
+
+ ### Recommended Architecture
+ ```
+ User Input
+     ↓
+ [Optional: Qwen3Guard-Stream-4B] ← optional pre-filter
+     ↓
+ [Qwen3Guard-Gen-4B]
+     ↓
+ Safe Response
+ ```
+
+ You can run this model standalone or behind a streaming guard for maximum protection.
+
60
+ ## Available Quantizations
61
+
62
+ | Level | Size | RAM Usage | Use Case |
63
+ |--------|-------|-----------|----------|
64
+ | Q2_K | ~1.8 GB | ~2.0 GB | Only on weak hardware |
65
+ | Q3_K_S | ~2.1 GB | ~2.3 GB | Minimal viability |
66
+ | Q4_K_M | ~2.8 GB | ~3.0 GB | ✅ Balanced choice |
67
+ | Q5_K_M | ~3.1 GB | ~3.3 GB | ✅✅ Highest quality |
68
+ | Q6_K | ~3.5 GB | ~3.8 GB | Near-FP16 fidelity |
69
+ | Q8_0 | ~4.5 GB | ~5.0 GB | Maximum accuracy |
70
+
71
+ > 💡 **Recommendation**: Use **Q5_K_M** for best balance of safety reliability and response quality.
72
+
73
+ ## Tools That Support It
74
+ - [LM Studio](https://lmstudio.ai) – load and test locally
75
+ - [OpenWebUI](https://openwebui.com) – deploy with RAG and tools
76
+ - [GPT4All](https://gpt4all.io) – private, offline AI
77
+ - Directly via `llama.cpp`, Ollama, or TGI
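For Ollama users, a minimal Modelfile mirroring the repository's MODELFILE settings might look like the sketch below. The local `.gguf` path and the chosen quant are assumptions; adjust them to the file you downloaded:

```text
FROM ./Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf

TEMPLATE """<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 32768
PARAMETER stop "<|im_end|>"
```

Build and run it with `ollama create qwen3guard-gen-4b -f Modelfile` followed by `ollama run qwen3guard-gen-4b`.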
+
+ ## Author
+ 👤 Geoff Munn (@geoffmunn)
+ 🔗 [Hugging Face Profile](https://huggingface.co/geoffmunn)
+
+ ## Disclaimer
+ Community conversion for local inference. Not affiliated with Alibaba Cloud.
SHA256SUMS.txt ADDED
@@ -0,0 +1,9 @@
+ d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8  Qwen3Guard-Gen-4B-f16:Q2_K.gguf
+ 6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31  Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf
+ d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56  Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf
+ 53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac  Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf
+ 2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4  Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf
+ 2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c  Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf
+ 209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d  Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf
+ a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6  Qwen3Guard-Gen-4B-f16:Q6_K.gguf
+ 43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a  Qwen3Guard-Gen-4B-f16:Q8_0.gguf
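Downloaded files can be verified against this manifest in one step with GNU coreutils: `sha256sum -c SHA256SUMS.txt`, run from the directory containing the `.gguf` files (it prints `<file>: OK` per entry). The mechanism, demonstrated on a throwaway file:

```shell
# Same verification flow as `sha256sum -c SHA256SUMS.txt`, on a sample file.
workdir=$(mktemp -d)
printf 'example payload' > "$workdir/sample.gguf"
(cd "$workdir" && sha256sum sample.gguf > SHA256SUMS.txt && sha256sum -c SHA256SUMS.txt)
rm -r "$workdir"
```

On macOS, `shasum -a 256 -c SHA256SUMS.txt` is the equivalent.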