Add Q2–Q8_0 quantized models with per-model cards, MODELFILE, CLI examples, and auto-upload
- .gitattributes +9 -0
- MODELFILE +25 -0
- Qwen3Guard-Gen-4B-Q2_K/README.md +80 -0
- Qwen3Guard-Gen-4B-Q3_K_M/README.md +80 -0
- Qwen3Guard-Gen-4B-Q3_K_S/README.md +80 -0
- Qwen3Guard-Gen-4B-Q4_K_M/README.md +80 -0
- Qwen3Guard-Gen-4B-Q4_K_S/README.md +80 -0
- Qwen3Guard-Gen-4B-Q5_K_M/README.md +80 -0
- Qwen3Guard-Gen-4B-Q5_K_S/README.md +80 -0
- Qwen3Guard-Gen-4B-Q6_K/README.md +80 -0
- Qwen3Guard-Gen-4B-Q8_0/README.md +80 -0
- Qwen3Guard-Gen-4B-f16:Q2_K.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q6_K.gguf +3 -0
- Qwen3Guard-Gen-4B-f16:Q8_0.gguf +3 -0
- README.md +84 -0
- SHA256SUMS.txt +9 -0
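The SHA256SUMS.txt manifest added above can be used to verify downloaded GGUF files; a minimal sketch, assuming GNU coreutils `sha256sum` is available and the command is run from the directory holding the files:

```shell
# Verify every downloaded file against the published manifest.
# sha256sum exits non-zero if any checksum does not match.
sha256sum -c SHA256SUMS.txt
```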
.gitattributes CHANGED
@@ -33,3 +33,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3Guard-Gen-4B-f16:Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
MODELFILE ADDED
@@ -0,0 +1,25 @@
+# MODELFILE for Qwen3Guard-Gen-4B
+# Used by LM Studio, OpenWebUI, GPT4All, etc.
+
+context_length: 32768
+embedding: false
+f16: cpu
+
+# Chat template using ChatML (used by Qwen)
+prompt_template: >-
+  <|im_start|>system
+  You are a helpful assistant who always refuses harmful requests.<|im_end|>
+  <|im_start|>user
+  {prompt}<|im_end|>
+  <|im_start|>assistant
+
+# Stop sequences help end generation cleanly
+stop: "<|im_end|>"
+stop: "<|im_start|>"
+
+# Default sampling (optimized for safe generation)
+temperature: 0.7
+top_p: 0.9
+top_k: 20
+min_p: 0.05
+repeat_penalty: 1.1
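When a frontend does not apply the MODELFILE's `prompt_template` automatically, the same ChatML framing can be built by hand. A minimal sketch in Python; the helper name `to_chatml` is illustrative and not part of the model files:

```python
def to_chatml(prompt: str,
              system: str = "You are a helpful assistant who always refuses harmful requests.") -> str:
    """Wrap a raw user prompt in the ChatML framing used by Qwen models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# The result can be passed as-is to a raw-completion endpoint.
text = to_chatml("Explain why the sky is blue.")
print(text)
```

The two `stop` sequences in the MODELFILE then cut generation at the next `<|im_end|>` or `<|im_start|>` token.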
Qwen3Guard-Gen-4B-Q2_K/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q2_K

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 1.7G
- **RAM Required**: ~2.0 GB
- **Speed**: ⚡ Fast
- **Quality**: Low
- **Recommendation**: Only for very weak devices; poor reasoning. Avoid.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q2_K.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_M/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q3_K_M

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 2.1G
- **RAM Required**: ~2.5 GB
- **Speed**: ⚡ Fast
- **Quality**: Low-Med
- **Recommendation**: Basic generation; acceptable for simple tasks.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_S/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q3_K_S

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 2.0G
- **RAM Required**: ~2.3 GB
- **Speed**: ⚡ Fast
- **Quality**: Low
- **Recommendation**: Minimal quality; may miss subtle risks.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_M/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q4_K_M

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 2.6G
- **RAM Required**: ~3.0 GB
- **Speed**: 🚀 Fast
- **Quality**: Balanced
- **Recommendation**: ✅ Best speed/quality balance.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_S/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q4_K_S

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 2.5G
- **RAM Required**: ~2.7 GB
- **Speed**: 🚀 Fast
- **Quality**: Medium
- **Recommendation**: Good for edge devices; decent output.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_M/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q5_K_M

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 3.0G
- **RAM Required**: ~3.3 GB
- **Speed**: 🐢 Medium
- **Quality**: High+
- **Recommendation**: ✅✅ Top choice for production safety apps.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_S/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q5_K_S

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 2.9G
- **RAM Required**: ~3.1 GB
- **Speed**: 🐢 Medium
- **Quality**: High
- **Recommendation**: High-quality responses; slightly faster than Q5_K_M.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q6_K/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q6_K

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 3.4G
- **RAM Required**: ~3.8 GB
- **Speed**: 🐌 Slow
- **Quality**: Near-FP16
- **Recommendation**: Excellent fidelity; ideal for precision tasks.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q6_K.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-Q8_0/README.md ADDED
@@ -0,0 +1,80 @@
---
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q8_0

Safety-aligned generative model. Designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 4.4G
- **RAM Required**: ~5.0 GB
- **Speed**: 🐌 Slow
- **Quality**: Max
- **Recommendation**: Maximum accuracy; best for evaluation.

## 🧑‍🏫 Beginner Example

1. Load in **LM Studio**
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for logic: add `/think` in prompt

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-f16:Q8_0.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) will apply this automatically.

## License

Apache 2.0
Qwen3Guard-Gen-4B-f16:Q2_K.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8
size 1797125792

Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31
size 2242747552

Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56
size 2054126752

Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac
size 2716068512

Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4
size 2602097312

Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c
size 3156920992

Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d
size 3091118752

Qwen3Guard-Gen-4B-f16:Q6_K.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6
size 3625326752

Qwen3Guard-Gen-4B-f16:Q8_0.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a
size 4693671072
README.md
ADDED
@@ -0,0 +1,84 @@
---
license: apache-2.0
tags:
- gguf
- qwen
- safety
- guardrail
- text-generation
- llama.cpp
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
pipeline_tag: text-generation
---

# Qwen3Guard-Gen-4B-GGUF

This is a **GGUF-quantized version** of **[Qwen3Guard-Gen-4B](https://huggingface.co/Qwen/Qwen3Guard-Gen-4B)**, a **safety-aligned generative model** from Alibaba's Qwen team.

Unlike standard LLMs, this model is **fine-tuned to refuse harmful requests by design**, making it ideal for applications where content safety is critical.

> ⚠️ This is a **generative model with built-in safety constraints**, not a classifier like `Qwen3Guard-Stream-4B`.

## 🛡 What Is Qwen3Guard-Gen-4B?

It's a **helpful yet harmless assistant** trained to:
- Respond helpfully to safe queries
- Politely decline unsafe ones (e.g., illegal acts, self-harm)
- Avoid generating toxic, violent, or deceptive content
- Maintain factual consistency while remaining cautious

Perfect for:
- Educational chatbots
- Customer service agents
- Mental health support tools
- Moderated community bots

## 🔗 Relationship to Other Safety Models

This model complements the other Qwen3 safety tools:

| Model | Role | Best For |
|-------|------|----------|
| **Qwen3Guard-Stream-4B** | ⚡ Input filter | Real-time moderation of user input |
| **Qwen3Guard-Gen-4B** | 🧠 Safe generator | Generating non-toxic responses |
| **Qwen3-4B-SafeRL** | 🛡️ Fully aligned agent | Multi-turn ethical conversations via RLHF |

### Recommended Architecture
```
User Input
    ↓
[Qwen3Guard-Stream-4B] ← optional pre-filter
    ↓
[Qwen3Guard-Gen-4B]
    ↓
Safe Response
```

You can run this model standalone or behind a streaming guard for maximum protection.

## Available Quantizations

| Level | Size | RAM Usage | Use Case |
|-------|------|-----------|----------|
| Q2_K | ~1.8 GB | ~2.0 GB | Only for very weak hardware |
| Q3_K_S | ~2.1 GB | ~2.3 GB | Minimal viability |
| Q4_K_M | ~2.8 GB | ~3.0 GB | ✅ Balanced choice |
| Q5_K_M | ~3.1 GB | ~3.3 GB | ✅✅ Highest quality |
| Q6_K | ~3.5 GB | ~3.8 GB | Near-FP16 fidelity |
| Q8_0 | ~4.5 GB | ~5.0 GB | Maximum accuracy |

> 💡 **Recommendation**: Use **Q5_K_M** for the best balance of safety reliability and response quality.

## Tools That Support It
- [LM Studio](https://lmstudio.ai) – load and test locally
- [OpenWebUI](https://openwebui.com) – deploy with RAG and tools
- [GPT4All](https://gpt4all.io) – private, offline AI
- Directly via `llama.cpp`, Ollama, or TGI
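For Ollama, a Modelfile along these lines wires one of the quants into `ollama create` (a sketch only; the parameter values mirror the defaults in the per-model cards, and the shipped MODELFILE in this repo may differ):

```text
FROM ./Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 32768
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM You are a helpful assistant who always refuses harmful requests.
```

Build and run with `ollama create qwen3guard -f MODELFILE` followed by `ollama run qwen3guard`.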

## Author
👤 Geoff Munn (@geoffmunn)
🔗 [Hugging Face Profile](https://huggingface.co/geoffmunn)

## Disclaimer
Community conversion for local inference. Not affiliated with Alibaba Cloud.
SHA256SUMS.txt
ADDED
@@ -0,0 +1,9 @@
d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8  Qwen3Guard-Gen-4B-f16:Q2_K.gguf
6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31  Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf
d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56  Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf
53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac  Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf
2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4  Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf
2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c  Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf
209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d  Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf
a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6  Qwen3Guard-Gen-4B-f16:Q6_K.gguf
43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a  Qwen3Guard-Gen-4B-f16:Q8_0.gguf
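Downloads can be verified against this checksum list with `sha256sum -c` (GNU coreutils; macOS ships `shasum -a 256` instead). A self-contained sketch of the mechanics, using a throwaway file rather than a multi-GB model:

```shell
# In the repo directory the real check is:
#   sha256sum --ignore-missing -c SHA256SUMS.txt
# Demonstration of the same mechanics on a small file:
printf 'hello\n' > demo.bin
sha256sum demo.bin > demo.sums      # writes "<hash>  demo.bin"
RESULT=$(sha256sum -c demo.sums)    # re-hashes the file and compares
echo "$RESULT"                      # prints: demo.bin: OK
rm demo.bin demo.sums
```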