Update README.md

README.md CHANGED

````diff
@@ -12,6 +12,7 @@ tags:
 - chat
 - edge-ai
 - tiny-model
+- imatrix
 base_model: Qwen/Qwen3-0.6B
 author: geoffmunn
 pipeline_tag: text-generation
@@ -20,7 +21,7 @@ language:
 - zh
 ---
 
-# Qwen3-0.6B-GGUF
+# Qwen3-0.6B-f16-GGUF
 
 This is a **GGUF-quantized version** of the **[Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)** language model — a compact **600-million-parameter** LLM designed for **ultra-fast inference on low-resource devices**.
 
@@ -63,7 +64,7 @@ It’s ideal for:
 I have run each of these models across 6 questions, and ranked them all based on the quality of the answers.
 **Qwen3-0.6B-f16:Q5_K_M** is the best model across all question types, but if you want to play it safe with a higher-precision model, consider **Qwen3-0.6B:Q8_0**.
 
-You can read the results here: [Qwen3-0.6b-analysis.md](Qwen3-0.6b-analysis.md)
+You can read the results here: [Qwen3-0.6b-f16-analysis.md](Qwen3-0.6b-f16-analysis.md)
 
 If you find this useful, please give the project a ❤️ like.
 
@@ -80,7 +81,7 @@ Each quantized model includes its own `README.md` and shares a common `MODELFILE
 Importing directly into Ollama should work, but you might encounter this error: `Error: invalid character '<' looking for beginning of value`.
 In this case, try these steps:
 
-1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
+1. `wget https://huggingface.co/geoffmunn/Qwen3-0.6B-f16/resolve/main/Qwen3-0.6B-f16%3AQ3_K_M.gguf` (replace the quantised version with the one you want)
 2. `nano Modelfile` and enter these details (again, replacing Q3_K_M with the version you want):
 ```text
 FROM ./Qwen3-0.6B-f16:Q3_K_M.gguf
````