geoffmunn committed on
Commit c510220 · verified · 1 Parent(s): 34fe1f3

Add Q2–Q8_0 quantized models with per-model cards, MODELFILE, CLI examples, and auto-upload
.gitattributes CHANGED
@@ -33,3 +33,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Qwen3Guard-Gen-4B-f16:Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
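The attribute rules above can be spot-checked with `git check-attr`, which reports which filter a given path resolves to. A minimal sketch using a throwaway repo (in a real clone, the final `git check-attr` line alone suffices):

```shell
# Confirm that a .gguf file is routed through Git LFS by the rules above.
repo=$(mktemp -d)
cd "$repo" && git init -q .
printf 'Qwen3Guard-Gen-4B-f16:Q2_K.gguf filter=lfs diff=lfs merge=lfs -text\n' > .gitattributes
git check-attr filter -- 'Qwen3Guard-Gen-4B-f16:Q2_K.gguf'   # prints "...: filter: lfs"
```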
MODELFILE ADDED
@@ -0,0 +1,25 @@
+ # MODELFILE for Qwen3Guard-Gen-4B
+ # Used by LM Studio, OpenWebUI, GPT4All, etc.
+
+ context_length: 32768
+ embedding: false
+ f16: cpu
+
+ # Chat template using ChatML (used by Qwen)
+ prompt_template: >-
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+
+ # Stop sequences help end generation cleanly
+ stop: "<|im_end|>"
+ stop: "<|im_start|>"
+
+ # Default sampling (optimized for safe generation)
+ temperature: 0.7
+ top_p: 0.9
+ top_k: 20
+ min_p: 0.05
+ repeat_penalty: 1.1
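As a quick sanity check, the `prompt_template` above can be rendered with a short script. The `format_chatml` helper below is illustrative, not part of any tool:

```python
# Render the ChatML template from the MODELFILE for a given user prompt.
# format_chatml is a hypothetical helper, shown only to illustrate the layout.
def format_chatml(prompt: str,
                  system: str = "You are a helpful assistant who always refuses harmful requests.") -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

if __name__ == "__main__":
    print(format_chatml("Explain why the sky is blue."))
```

The rendered string ends after the `<|im_start|>assistant` header, which is exactly where the model begins generating; the `stop` sequences cut it off at the closing `<|im_end|>`.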
Qwen3Guard-Gen-4B-Q2_K/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q2_K
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 1.7G
+ - **RAM Required**: ~2.0 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low
+ - **Recommendation**: Only for very weak devices; poor reasoning. Avoid.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q2_K.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q3_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.1G
+ - **RAM Required**: ~2.5 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low-Med
+ - **Recommendation**: Basic generation; acceptable for simple tasks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q3_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q3_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.0G
+ - **RAM Required**: ~2.3 GB
+ - **Speed**: ⚡ Fast
+ - **Quality**: Low
+ - **Recommendation**: Minimal quality; may miss subtle risks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q4_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.6G
+ - **RAM Required**: ~3.0 GB
+ - **Speed**: 🚀 Fast
+ - **Quality**: Balanced
+ - **Recommendation**: ✅ Best speed/quality balance.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q4_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q4_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.5G
+ - **RAM Required**: ~2.7 GB
+ - **Speed**: 🚀 Fast
+ - **Quality**: Medium
+ - **Recommendation**: Good for edge devices; decent output.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_M/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q5_K_M
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 3.0G
+ - **RAM Required**: ~3.3 GB
+ - **Speed**: 🐢 Medium
+ - **Quality**: High+
+ - **Recommendation**: ✅✅ Top choice for production safety apps.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q5_K_S/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q5_K_S
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 2.9G
+ - **RAM Required**: ~3.1 GB
+ - **Speed**: 🐢 Medium
+ - **Quality**: High
+ - **Recommendation**: High-quality responses; slightly faster than Q5_K_M.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q6_K/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q6_K
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 3.4G
+ - **RAM Required**: ~3.8 GB
+ - **Speed**: 🐌 Slow
+ - **Quality**: Near-FP16
+ - **Recommendation**: Excellent fidelity; ideal for precision tasks.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q6_K.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-Q8_0/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - safety
+ - guardrail
+ - qwen
+ - text-generation
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ ---
+
+ # Qwen3Guard-Gen-4B-Q8_0
+
+ A safety-aligned generative model, designed to **refuse harmful requests gracefully**.
+
+ ## Model Info
+ - **Type**: Generative LLM with built-in safety
+ - **Size**: 4.4G
+ - **RAM Required**: ~5.0 GB
+ - **Speed**: 🐌 Slow
+ - **Quality**: Max
+ - **Recommendation**: Maximum accuracy; best for evaluation.
+
+ ## 🧑‍🏫 Beginner Example
+
+ 1. Load in **LM Studio**
+ 2. Type:
+ ```
+ How do I hack my school's WiFi?
+ ```
+ 3. The model replies:
+ ```
+ I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
+ ```
+
+ > ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation
+
+ ## ⚙️ Default Parameters (Recommended)
+
+ | Parameter | Value | Why |
+ |-----------|-------|-----|
+ | Temperature | 0.7 | Balanced creativity and coherence |
+ | Top-P | 0.9 | Broad sampling without excess randomness |
+ | Top-K | 20 | Focused candidate pool |
+ | Min-P | 0.05 | Prevents rare-token collapse |
+ | Repeat Penalty | 1.1 | Reduces repetition |
+ | Context Length | 32768 | Full Qwen3 context support |
+
+ > 🔁 Enable thinking mode for logic: add `/think` to the prompt
+
+ ## 🖥️ CLI Example Using llama.cpp
+
+ ```bash
+ # Note: recent llama.cpp builds name this binary `llama-cli`.
+ ./main -m Qwen3Guard-Gen-4B-f16:Q8_0.gguf \
+   -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
+   --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
+   --n-predict 512
+ ```
+
+ Expected output:
+ > Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
+
+ ## 🧩 Prompt Template (ChatML Format)
+
+ Use ChatML for best results:
+
+ ```text
+ <|im_start|>system
+ You are a helpful assistant who always refuses harmful requests.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ Most tools (LM Studio, OpenWebUI) will apply this automatically.
+
+ ## License
+
+ Apache 2.0
Qwen3Guard-Gen-4B-f16:Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8
+ size 1797125792
Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31
+ size 2242747552
Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56
+ size 2054126752
Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac
+ size 2716068512
Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4
+ size 2602097312
Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c
+ size 3156920992
Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d
+ size 3091118752
Qwen3Guard-Gen-4B-f16:Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6
+ size 3625326752
Qwen3Guard-Gen-4B-f16:Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a
+ size 4693671072
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - qwen
+ - safety
+ - guardrail
+ - text-generation
+ - llama.cpp
+ base_model: Qwen/Qwen3Guard-Gen-4B
+ author: geoffmunn
+ pipeline_tag: text-generation
+ ---
+
+ # Qwen3Guard-Gen-4B-GGUF
+
+ This is a **GGUF-quantized version** of **[Qwen3Guard-Gen-4B](https://huggingface.co/Qwen/Qwen3Guard-Gen-4B)**, a **safety-aligned generative model** from Alibaba's Qwen team.
+
+ Unlike standard LLMs, this model is **fine-tuned to refuse harmful requests by design**, making it ideal for applications where content safety is critical.
+
+ > ⚠️ This is a **generative model with built-in safety constraints**, not a classifier like `Qwen3Guard-Stream-4B`.
+
+ ## 🛡 What Is Qwen3Guard-Gen-4B?
+
+ It's a **helpful yet harmless assistant** trained to:
+ - Respond helpfully to safe queries
+ - Politely decline unsafe ones (e.g., illegal acts, self-harm)
+ - Avoid generating toxic, violent, or deceptive content
+ - Maintain factual consistency while being cautious
+
+ Perfect for:
+ - Educational chatbots
+ - Customer service agents
+ - Mental health support tools
+ - Moderated community bots
+
+ ## 🔗 Relationship to Other Safety Models
+
+ This model complements other Qwen3 safety tools:
+
+ | Model | Role | Best For |
+ |-------|------|----------|
+ | **Qwen3Guard-Stream-4B** | ⚡ Input filter | Real-time moderation of user input |
+ | **Qwen3Guard-Gen-4B** | 🧠 Safe generator | Generating non-toxic responses |
+ | **Qwen3-4B-SafeRL** | 🛡️ Fully aligned agent | Multi-turn ethical conversations via RLHF |
+
+ ### Recommended Architecture
+ ```
+ User Input
+     ↓
+ [Optional: Qwen3Guard-Stream-4B] ← optional pre-filter
+     ↓
+ [Qwen3Guard-Gen-4B]
+     ↓
+ Safe Response
+ ```
+
+ You can run this model standalone or behind a streaming guard for maximum protection.
+
60
+ ## Available Quantizations
61
+
62
+ | Level | Size | RAM Usage | Use Case |
63
+ |--------|-------|-----------|----------|
64
+ | Q2_K | ~1.8 GB | ~2.0 GB | Only on weak hardware |
65
+ | Q3_K_S | ~2.1 GB | ~2.3 GB | Minimal viability |
66
+ | Q4_K_M | ~2.8 GB | ~3.0 GB | ✅ Balanced choice |
67
+ | Q5_K_M | ~3.1 GB | ~3.3 GB | ✅✅ Highest quality |
68
+ | Q6_K | ~3.5 GB | ~3.8 GB | Near-FP16 fidelity |
69
+ | Q8_0 | ~4.5 GB | ~5.0 GB | Maximum accuracy |
70
+
71
+ > 💡 **Recommendation**: Use **Q5_K_M** for best balance of safety reliability and response quality.
72
+
73
+ ## Tools That Support It
74
+ - [LM Studio](https://lmstudio.ai) – load and test locally
75
+ - [OpenWebUI](https://openwebui.com) – deploy with RAG and tools
76
+ - [GPT4All](https://gpt4all.io) – private, offline AI
77
+ - Directly via `llama.cpp`, Ollama, or TGI
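For Ollama users, a minimal Modelfile mirroring the repository's MODELFILE settings might look like the sketch below. The local `.gguf` path and the chosen quant are assumptions; adjust them to the file you downloaded:

```text
FROM ./Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf

TEMPLATE """<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 32768
PARAMETER stop "<|im_end|>"
```

Build and run it with `ollama create qwen3guard-gen-4b -f Modelfile` followed by `ollama run qwen3guard-gen-4b`.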
+
+ ## Author
+ 👤 Geoff Munn (@geoffmunn)
+ 🔗 [Hugging Face Profile](https://huggingface.co/geoffmunn)
+
+ ## Disclaimer
+ Community conversion for local inference. Not affiliated with Alibaba Cloud.
SHA256SUMS.txt ADDED
@@ -0,0 +1,9 @@
+ d0cc096b73b9a62f8a8b1d72fe2630f63acfcb70e19d70d128433d9f68bbcbf8  Qwen3Guard-Gen-4B-f16:Q2_K.gguf
+ 6aa5c397cd2e67d1419a6512fbb4798e005149f5bdb51e97c86da0ba3df8bd31  Qwen3Guard-Gen-4B-f16:Q3_K_M.gguf
+ d764c3f4850f64f98b294780d3fdf6097cd7e95c1d7fac7f2463bdcadfb89d56  Qwen3Guard-Gen-4B-f16:Q3_K_S.gguf
+ 53903f9ff24741e227a93d41f34166d8501ae58b7d3c7d9c2cbbe7147af3bfac  Qwen3Guard-Gen-4B-f16:Q4_K_M.gguf
+ 2c5c215b89874319e766533daebf02b66b9135f21c0727cb913e8d8aeeca7ee4  Qwen3Guard-Gen-4B-f16:Q4_K_S.gguf
+ 2a00c0b330d987efee5fa7d53e21ce6527fdd8a42a3dac6bb52008311744ad5c  Qwen3Guard-Gen-4B-f16:Q5_K_M.gguf
+ 209a6698018b3bda40269352b2032b88ab1b2e6261ec301827576858cf49ca3d  Qwen3Guard-Gen-4B-f16:Q5_K_S.gguf
+ a99e1805893a26f9f7789d91483552b493b239244a06d3c40c9d917362e684c6  Qwen3Guard-Gen-4B-f16:Q6_K.gguf
+ 43b0a22d6f17b83afd514dfff317b9c3dd6420a0a79464f881c2b8394ec0bb0a  Qwen3Guard-Gen-4B-f16:Q8_0.gguf
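Downloaded files can be verified against this manifest in one step with GNU coreutils: `sha256sum -c SHA256SUMS.txt`, run from the directory containing the `.gguf` files (it prints `<file>: OK` per entry). The mechanism, demonstrated on a throwaway file:

```shell
# Same verification flow as `sha256sum -c SHA256SUMS.txt`, on a sample file.
workdir=$(mktemp -d)
printf 'example payload' > "$workdir/sample.gguf"
(cd "$workdir" && sha256sum sample.gguf > SHA256SUMS.txt && sha256sum -c SHA256SUMS.txt)
rm -r "$workdir"
```

On macOS, `shasum -a 256 -c SHA256SUMS.txt` is the equivalent.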