Size listings updated
README.md (changed):

--- a/README.md
+++ b/README.md
@@ -1,15 +1,17 @@
 ---
 license: apache-2.0
 tags:
-
-
-
-
-
-
-
+- gguf
+- qwen
+- llama.cpp
+- quantized
+- text-generation
+- tiny-model
+- edge-ai
 base_model: Qwen/Qwen3-0.6B
 author: geoffmunn
+language:
+- en
 ---
 
 # Qwen3-0.6B-GGUF
@@ -22,17 +24,17 @@ Converted for use with `llama.cpp` and compatible tools like OpenWebUI, LM Studio
 
 ## Available Quantizations (from f16)
 
-| Level | Quality | Speed | Size
+| Level | Quality | Speed | Size | Recommendation |
 |----------|--------------|----------|-----------|----------------|
-| Q2_K | Minimal | ⚡ Fastest |
-| Q3_K_S | Low | ⚡ Fast |
-| Q3_K_M | Low-Medium | ⚡ Fast |
-| Q4_K_S | Medium | 🚀 Fast |
-| Q4_K_M | ✅ Practical | 🚀 Fast |
-| Q5_K_S | High | 🟢 Medium |
-| Q5_K_M | 🔺 Max Reasoning | 🟢 Medium |
-| Q6_K | Near-FP16 | 🐌 Slow |
-| Q8_0 | Lossless* | 🐌 Slow |
+| Q2_K | Minimal | ⚡ Fastest | 347 MB | Use only on severely constrained systems (e.g., Raspberry Pi). Severely degraded output. |
+| Q3_K_S | Low | ⚡ Fast | 390 MB | Barely usable; slight improvement over Q2_K. Avoid unless space-limited. |
+| Q3_K_M | Low-Medium | ⚡ Fast | 414 MB | Usable for simple prompts on older CPUs. Acceptable for basic chat. |
+| Q4_K_S | Medium | 🚀 Fast | 471 MB | Good balance for low-end devices. Recommended for embedded or mobile use. |
+| Q4_K_M | ✅ Practical | 🚀 Fast | 484 MB | Best overall choice for most users. Solid performance on weak hardware. |
+| Q5_K_S | High | 🟢 Medium | 544 MB | Slight quality gain; good for testing or when extra fidelity matters. |
+| Q5_K_M | 🔺 Max Reasoning | 🟢 Medium | 551 MB | Best quality available for this model. Use if you need slightly better logic or coherence. |
+| Q6_K | Near-FP16 | 🐌 Slow | 623 MB | Diminishing returns. Only use if full consistency is critical and RAM allows. |
+| Q8_0 | Lossless* | 🐌 Slow | 805 MB | Maximum fidelity, but gains are minor due to model size. Ideal for archival or benchmarking. |
 
 > 💡 **Recommendations by Use Case**
 >
@@ -78,4 +80,4 @@ sha256sum -c SHA256SUMS.txt
 
 ## License
 
-Apache 2.0 – see base model for full terms.
+Apache 2.0 – see base model for full terms.