ubergarm committed on
Commit b477b64 · 1 Parent(s): b20fd34

uploading new recipe IQ3_K and update graph

Files changed (2)
  1. README.md +2 -2
  2. images/perplexity.png +2 -2
README.md CHANGED
@@ -51,7 +51,7 @@ This is the "full quality" baseline version of the model and the only one in thi
  ```bash
  #!/usr/bin/env bash
  
- # Q4_0 routed experts approximating original QAT design
+ # Q4_0 (patched) routed experts approximating original QAT design
  # Q8_0 everything else
  
  custom="
@@ -154,7 +154,7 @@ numactl -N ${SOCKET} -m ${SOCKET} \
  ## IQ3_K 459.432 GiB (3.845 BPW)
  Final estimate: PPL = 2.1456 +/- 0.00941
  
- *NOTE*: Given there were some issues with the original q4_0 quantization, I've replaced the original IQ3_K with this new smaller one using the patched q4_x quantization. The original one was `459.432 GiB (3.845 BPW)` and will be squash deleted to save on public quota soon. This new one uses q4_x patched and only applies imatrix to the iq3_k tensors but *not* to the q8_0 or q4_x. More details in [discussion 4 here](https://huggingface.co/ubergarm/Kimi-K2-Thinking-GGUF/discussions/4#6918a268149cb086f69915ce). It has almost the same perplexity so a good improvement.
+ *NOTE*: Given there were some issues with the original q4_0 quantization, I've replaced the original IQ3_K with this new smaller one using the patched q4_x quantization. The original one was `474.772 GiB (3.973 BPW)` and will be squash deleted to save on public quota soon. This new one uses q4_x patched and only applies imatrix to the iq3_k tensors but *not* to the q8_0 or q4_x. More details in [discussion 4 here](https://huggingface.co/ubergarm/Kimi-K2-Thinking-GGUF/discussions/4#6918a268149cb086f69915ce). It has almost the same perplexity so a good improvement.
  
  <details>
  
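The corrected figures in the README note can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes that BPW means total file bits divided by total weight count and that GiB means 2^30 bytes; the helper names `implied_weights` and `size_saving` are hypothetical. If both size/BPW pairs describe the same model, they should imply nearly the same number of weights.

```python
GIB = 2**30  # bytes per GiB (assumption: binary gibibytes)


def implied_weights(size_gib: float, bpw: float) -> float:
    """Weight count implied by a file size in GiB and a bits-per-weight figure."""
    return size_gib * GIB * 8 / bpw


def size_saving(old_gib: float, new_gib: float) -> tuple[float, float]:
    """Return (GiB saved, percent saved) when going from old_gib to new_gib."""
    saved = old_gib - new_gib
    return saved, 100.0 * saved / old_gib


# Figures quoted in the README note and heading.
old_size_gib, old_bpw = 474.772, 3.973  # original IQ3_K
new_size_gib, new_bpw = 459.432, 3.845  # replacement IQ3_K

saved_gib, saved_pct = size_saving(old_size_gib, new_size_gib)
old_weights = implied_weights(old_size_gib, old_bpw)
new_weights = implied_weights(new_size_gib, new_bpw)

print(f"saved: {saved_gib:.3f} GiB ({saved_pct:.2f}%)")
print(f"implied weights (old): {old_weights:.4e}")
print(f"implied weights (new): {new_weights:.4e}")
```

Under these assumptions the two pairs agree on a weight count of roughly 1.03 trillion, and the new file is about 15.3 GiB (roughly 3.2%) smaller. That fits `474.772 GiB (3.973 BPW)` being the right figure for the original quant, rather than the `459.432 GiB (3.845 BPW)` that the old note repeated from the heading.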
images/perplexity.png CHANGED

Git LFS Details

  • SHA256: eebf558ca2c5493ecef8cb628af03eb72b8421beb12eb93c36ca2f9295b278d2
  • Pointer size: 131 Bytes
  • Size of remote file: 165 kB

Git LFS Details

  • SHA256: d52cb33b632fbe21790be92dc1e91de7fa1061fc90ba475f9f66b87df42d4ab1
  • Pointer size: 131 Bytes
  • Size of remote file: 179 kB