geoffmunn committed

Commit fb02b23 · verified · Parent: 099c2cc

Update README.md

Files changed (1): README.md (+4 -3)

README.md CHANGED
````diff
@@ -12,6 +12,7 @@ tags:
 - chat
 - edge-ai
 - tiny-model
+- imatrix
 base_model: Qwen/Qwen3-0.6B
 author: geoffmunn
 pipeline_tag: text-generation
@@ -20,7 +21,7 @@ language:
 - zh
 ---
 
-# Qwen3-0.6B-GGUF
+# Qwen3-0.6B-f16-GGUF
 
 This is a **GGUF-quantized version** of the **[Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)** language model — a compact **600-million-parameter** LLM designed for **ultra-fast inference on low-resource devices**.
 
@@ -63,7 +64,7 @@ It’s ideal for:
 I have run each of these models across 6 questions, and ranked them all based on the quality of the answers.
 **Qwen3-0.6B-f16:Q5_K_M** is the best model across all question types, but if you want to play it safe with a higher precision model, then you could consider using **Qwen3-0.6B:Q8_0**.
 
-You can read the results here: [Qwen3-0.6b-analysis.md](Qwen3-0.6b-analysis.md)
+You can read the results here: [Qwen3-0.6b-f16-analysis.md](Qwen3-0.6b-f16-analysis.md)
 
 If you find this useful, please give the project a ❤️ like.
 
@@ -80,7 +81,7 @@ Each quantized model includes its own `README.md` and shares a common `MODELFILE
 Importing directly into Ollama should work, but you might encounter this error: `Error: invalid character '<' looking for beginning of value`.
 In this case try these steps:
 
-1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
+1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B-f16/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
 2. `nano Modelfile` and enter these details (again, replacing Q3_K_M with the version you want):
 ```text
 FROM ./Qwen3-0.6B-f16:Q3_K_M.gguf
```
````
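The diff is truncated after the `FROM` line, so the README's remaining steps are not shown here. For context, a minimal complete `Modelfile` for this import might look like the sketch below — the `PARAMETER` line is an illustrative assumption, not something stated in the source; only the `FROM` directive is quoted from the diff:

```text
# Minimal Modelfile for importing the quantised weights into Ollama
# (wget decodes %3A to ':' in the saved filename, so the path below matches)
FROM ./Qwen3-0.6B-f16:Q3_K_M.gguf

# Optional sampling default — an example value, not from the source README
PARAMETER temperature 0.7
```

Assuming the README continues along the usual Ollama workflow, the model would then be registered and run with `ollama create qwen3-0.6b -f Modelfile` followed by `ollama run qwen3-0.6b` (the model name here is an arbitrary local tag).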