Update README.md
README.md
CHANGED
@@ -30,6 +30,12 @@ data: `load_imatrix: loaded 314 importance matrix entries from imatrix_caesar.da
Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
+Dynamic quantization:
+- img_in, guidance_in.in_layer, final_layer.linear: f32/bf16/f16
+- guidance_in, final_layer: bf16/f16
+- img_attn.qkv, linear1: two bits up
+- txt_mod.lin, txt_mlp, txt_attn.proj: one bit down
+
## Experimental from f16
| Filename | Quant type | File Size | Description | Example Image |
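
As a rough illustration of the dynamic quantization rules added in the diff above, the sketch below shows one way name-based per-tensor overrides could be expressed. It is not the actual lcpp.patch or llama.cpp code: the `QuantLevel` ladder, `shift`, and `pick_type` helpers are invented stand-ins, and reading "two bits up" / "one bit down" as steps along an ordered list of quant types is an assumption.

```cpp
// Sketch only: name-based per-tensor quant rules similar in spirit to the
// README list above. The enum and helpers are stand-ins, not ggml/llama.cpp types.
#include <cstdio>
#include <string>
#include <vector>

// Stand-in for an ordered ladder of quant "bit levels".
enum class QuantLevel { Q2_K, Q3_K, Q4_K, Q5_K, Q6_K, Q8_0, F16 };

static const char *level_name(QuantLevel q) {
    switch (q) {
        case QuantLevel::Q2_K: return "Q2_K";
        case QuantLevel::Q3_K: return "Q3_K";
        case QuantLevel::Q4_K: return "Q4_K";
        case QuantLevel::Q5_K: return "Q5_K";
        case QuantLevel::Q6_K: return "Q6_K";
        case QuantLevel::Q8_0: return "Q8_0";
        case QuantLevel::F16:  return "F16";
    }
    return "?";
}

// Move `steps` levels up (positive) or down (negative) the ladder, clamped.
static QuantLevel shift(QuantLevel base, int steps) {
    int v = static_cast<int>(base) + steps;
    if (v < 0) v = 0;
    if (v > static_cast<int>(QuantLevel::F16)) v = static_cast<int>(QuantLevel::F16);
    return static_cast<QuantLevel>(v);
}

static bool contains(const std::string &name, const char *pat) {
    return name.find(pat) != std::string::npos;
}

// Pick a per-tensor type from the tensor name and the base quant type.
static QuantLevel pick_type(const std::string &name, QuantLevel base) {
    if (contains(name, "img_in") || contains(name, "guidance_in.in_layer") ||
        contains(name, "final_layer.linear"))
        return QuantLevel::F16;   // kept in high precision (f32/bf16/f16 in the README)
    if (contains(name, "guidance_in") || contains(name, "final_layer"))
        return QuantLevel::F16;   // bf16/f16 in the README
    if (contains(name, "img_attn.qkv") || contains(name, "linear1"))
        return shift(base, +2);   // "two bits up"
    if (contains(name, "txt_mod.lin") || contains(name, "txt_mlp") ||
        contains(name, "txt_attn.proj"))
        return shift(base, -1);   // "one bit down"
    return base;                  // everything else: base quant type
}

int main() {
    const QuantLevel base = QuantLevel::Q4_K;
    // Hypothetical tensor names, for demonstration only.
    std::vector<std::string> tensors = {
        "img_in.weight", "double_blocks.0.img_attn.qkv.weight",
        "double_blocks.0.txt_mlp.0.weight", "single_blocks.3.linear1.weight",
        "double_blocks.1.img_mlp.0.weight",
    };
    for (const auto &t : tensors)
        std::printf("%-40s -> %s\n", t.c_str(), level_name(pick_type(t, base)));
}
```

The more specific patterns (e.g. `guidance_in.in_layer`) are checked before the broader ones (`guidance_in`) so that substring matching does not shadow the intended rule.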