Update README.md

README.md CHANGED

````diff
@@ -12,6 +12,7 @@ tags:
 - chat
 - edge-ai
 - tiny-model
+- imatrix
 base_model: Qwen/Qwen3-0.6B
 author: geoffmunn
 pipeline_tag: text-generation
@@ -20,7 +21,7 @@ language:
 - zh
 ---
 
-# Qwen3-0.6B-GGUF
+# Qwen3-0.6B-f16-GGUF
 
 This is a **GGUF-quantized version** of the **[Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)** language model — a compact **600-million-parameter** LLM designed for **ultra-fast inference on low-resource devices**.
 
@@ -63,7 +64,7 @@ It’s ideal for:
 I have run each of these models across 6 questions, and ranked them all based on the quality of the answers.
 **Qwen3-0.6B-f16:Q5_K_M** is the best model across all question types, but if you want to play it safe with a higher-precision model, consider **Qwen3-0.6B:Q8_0**.
 
-You can read the results here: [Qwen3-0.6b-analysis.md](Qwen3-0.6b-analysis.md)
+You can read the results here: [Qwen3-0.6b-f16-analysis.md](Qwen3-0.6b-f16-analysis.md)
 
 If you find this useful, please give the project a ❤️ like.
 
@@ -80,7 +81,7 @@ Each quantized model includes its own `README.md` and shares a common `MODELFILE
 Importing directly into Ollama should work, but you might encounter this error: `Error: invalid character '<' looking for beginning of value`.
 In this case, try these steps:
 
-1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
+1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B-f16/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
 2. `nano Modelfile` and enter these details (again, replacing Q3_K_M with the version you want):
 ```text
 FROM ./Qwen3-0.6B-f16:Q3_K_M.gguf
````