geoffmunn committed
Commit 04912b7 · verified · 1 Parent(s): 83d7057

Size listings updated

Files changed (1)
1. README.md (+20 -18)
README.md CHANGED
@@ -1,15 +1,17 @@
  ---
  license: apache-2.0
  tags:
- - gguf
- - qwen
- - llama.cpp
- - quantized
- - text-generation
- - tiny-model
- - edge-ai
+ - gguf
+ - qwen
+ - llama.cpp
+ - quantized
+ - text-generation
+ - tiny-model
+ - edge-ai
  base_model: Qwen/Qwen3-0.6B
  author: geoffmunn
+ language:
+ - en
  ---

  # Qwen3-0.6B-GGUF
@@ -22,17 +24,17 @@ Converted for use with `llama.cpp` and compatible tools like OpenWebUI, LM Studi

  ## Available Quantizations (from f16)

- | Level | Quality | Speed | Size Est. | Recommendation |
+ | Level | Quality | Speed | Size | Recommendation |
  |----------|--------------|----------|-----------|----------------|
- | Q2_K | Minimal | ⚡ Fastest | ~0.3 GB | Use only on severely constrained systems (e.g., Raspberry Pi). Severely degraded output. |
- | Q3_K_S | Low | ⚡ Fast | ~0.4 GB | Barely usable; slight improvement over Q2_K. Avoid unless space-limited. |
- | Q3_K_M | Low-Medium | ⚡ Fast | ~0.4 GB | Usable for simple prompts on older CPUs. Acceptable for basic chat. |
- | Q4_K_S | Medium | 🚀 Fast | ~0.5 GB | Good balance for low-end devices. Recommended for embedded or mobile use. |
- | Q4_K_M | ✅ Practical | 🚀 Fast | ~0.5 GB | Best overall choice for most users. Solid performance on weak hardware. |
- | Q5_K_S | High | 🐢 Medium | ~0.5 GB | Slight quality gain; good for testing or when extra fidelity matters. |
- | Q5_K_M | 🔺 Max Reasoning | 🐢 Medium | ~0.5 GB | Best quality available for this model. Use if you need slightly better logic or coherence. |
- | Q6_K | Near-FP16 | 🐌 Slow | ~0.6 GB | Diminishing returns. Only use if full consistency is critical and RAM allows. |
- | Q8_0 | Lossless* | 🐌 Slow | ~0.8 GB | Maximum fidelity, but gains are minor due to model size. Ideal for archival or benchmarking. |
+ | Q2_K | Minimal | ⚡ Fastest | 347 MB | Use only on severely constrained systems (e.g., Raspberry Pi). Severely degraded output. |
+ | Q3_K_S | Low | ⚡ Fast | 390 MB | Barely usable; slight improvement over Q2_K. Avoid unless space-limited. |
+ | Q3_K_M | Low-Medium | ⚡ Fast | 414 MB | Usable for simple prompts on older CPUs. Acceptable for basic chat. |
+ | Q4_K_S | Medium | 🚀 Fast | 471 MB | Good balance for low-end devices. Recommended for embedded or mobile use. |
+ | Q4_K_M | ✅ Practical | 🚀 Fast | 484 MB | Best overall choice for most users. Solid performance on weak hardware. |
+ | Q5_K_S | High | 🐢 Medium | 544 MB | Slight quality gain; good for testing or when extra fidelity matters. |
+ | Q5_K_M | 🔺 Max Reasoning | 🐢 Medium | 551 MB | Best quality available for this model. Use if you need slightly better logic or coherence. |
+ | Q6_K | Near-FP16 | 🐌 Slow | 623 MB | Diminishing returns. Only use if full consistency is critical and RAM allows. |
+ | Q8_0 | Lossless* | 🐌 Slow | 805 MB | Maximum fidelity, but gains are minor due to model size. Ideal for archival or benchmarking. |

  > 💡 **Recommendations by Use Case**
  >
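
As a quick way to act on the Q4_K_M recommendation in the table above, here is a minimal sketch of fetching and running that quant with llama.cpp's `llama-cli`. The repo id (`geoffmunn/Qwen3-0.6B-GGUF`) and the `.gguf` filename are assumptions inferred from the model name and author, so check the repo's file listing for the exact names.

```bash
# Minimal sketch: download and run the recommended Q4_K_M quant with llama.cpp.
# The repo id and filename below are assumptions; verify them against the repo.
huggingface-cli download geoffmunn/Qwen3-0.6B-GGUF Qwen3-0.6B-Q4_K_M.gguf --local-dir .

# Generate up to 128 tokens from a short prompt; runs on CPU by default.
llama-cli -m Qwen3-0.6B-Q4_K_M.gguf -p "Explain GGUF in one sentence." -n 128
```

At 484 MB, Q4_K_M comfortably fits the low-end devices the table targets, which is why it is flagged as the default choice.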
@@ -78,4 +80,4 @@ sha256sum -c SHA256SUMS.txt

  ## License

- Apache 2.0 – see base model for full terms.
+ Apache 2.0 – see base model for full terms.
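
The last hunk's header carries the card's verification step, `sha256sum -c SHA256SUMS.txt`. A hedged end-to-end sketch of that check, assuming the manifest is named as shown and sits next to the downloaded quant:

```bash
# Verify a downloaded quant against the repo's checksum manifest.
# SHA256SUMS.txt is named in the hunk header above; the .gguf filename
# is an assumption and should match whichever file you actually fetched.
huggingface-cli download geoffmunn/Qwen3-0.6B-GGUF SHA256SUMS.txt --local-dir .
sha256sum -c SHA256SUMS.txt --ignore-missing  # only checks files present locally
```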
 