LLMJapan/Qwen2.5-Coder-32B-Instruct_exl3


LLMJapan committed on May 16

Commit 81a17bb (verified) · 1 parent: df66b16

Update README.md

Files changed (1)

README.md +1 -1


@@ -25,7 +25,7 @@ I used [exllamav3 version 0.0.2](https://github.com/turboderp-org/exllamav3/rele
 [8.0bpw](https://huggingface.co/LLMJapan/Qwen2.5-Coder-32B-Instruct_exl3/tree/8.0bpw)
-For coding, I found >6.0bpw or preferably 8.0bpw model with KV Cache Quantization (>Q6) is much better than 4.0bpw.
+For coding, I found >=6.0bpw or preferably 8.0bpw model with KV Cache Quantization (>=Q6) is much better than 4.0bpw.
 If you are using these models only for short Auto Completion, 4.0bpw is usable.
 ## Credits