Update README.md

README.md CHANGED

@@ -76,7 +76,7 @@ output\.weight=iq6_k
 ```
 
-- Mostly iq5_ks GPU layers to minimize loss cheaply, keep it fast, and minimize the number of quantization types.
+- Mostly iq5_ks GPU layers to minimize loss cheaply, keep it fast (as iqX_ks quantizations are very fast), and minimize the number of quantization types.
 
 - iq3_ks shared experts near the beginning and end, as this seems to be where there are perplexity 'bumps.'
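
The bullets above describe a regex→quant-type recipe of the kind the hunk context (`output\.weight=iq6_k`) comes from. A minimal sketch of what such rules might look like, assuming GGUF-style tensor names (`blk.N.*`, `_shexp` for shared experts); the specific layer indices and tensor patterns here are illustrative assumptions, not the README's actual recipe:

```
# Illustrative rules only; names/indices are assumptions.
# Shared experts near the start and end, where perplexity 'bumps' appear:
blk\.[0-2]\.ffn_.*_shexp\.weight=iq3_ks
blk\.(59|60|61)\.ffn_.*_shexp\.weight=iq3_ks
# Bulk of the GPU-resident layers:
blk\..*\.attn_.*\.weight=iq5_ks
# Output head, as shown in the diff context:
output\.weight=iq6_k
```

Rules like these are typically matched top-to-bottom against each tensor name, so the more specific shared-expert patterns are listed before the catch-all.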