Eviation committed
Commit 2573ba8 · verified · 1 Parent(s): c679689

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -30,6 +30,12 @@ data: `load_imatrix: loaded 314 importance matrix entries from imatrix_caesar.da
 
  Using [llama.cpp quantize cae9fb4](https://github.com/ggerganov/llama.cpp/commit/cae9fb4361138b937464524eed907328731b81f6) with modified [lcpp.patch](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/lcpp.patch).
 
+ Dynamic quantization:
+ - img_in, guidance_in.in_layer, final_layer.linear: f32/bf16/f16
+ - guidance_in, final_layer: bf16/f16
+ - img_attn.qkv, linear1: two bits up
+ - txt_mod.lin, txt_mlp, txt_attn.proj: one bit down
+
  ## Experimental from f16
 
  | Filename | Quant type | File Size | Description | Example Image |
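
The "Dynamic quantization" list added in this commit can be read as a per-tensor override table. The sketch below is one possible interpretation, not the logic in lcpp.patch: the `LADDER` of quant types, the `shift_bits` and `pick_quant` helpers, the first-match-wins ordering, and the reading of "two bits up" / "one bit down" as steps in nominal bit width are all assumptions made for illustration.

```python
# Nominal bit widths mapped to quant type names (assumed ladder, for illustration only).
LADDER = {2: "Q2_K", 3: "Q3_K", 4: "Q4_K", 5: "Q5_K", 6: "Q6_K", 8: "Q8_0"}

# (substring patterns, action). Checked in order, first match wins, so the more
# specific names (e.g. guidance_in.in_layer) come before the general ones.
# A string action means "keep at that float precision"; an int shifts the bit width.
RULES = [
    (("img_in", "guidance_in.in_layer", "final_layer.linear"), "f32/bf16/f16"),
    (("guidance_in", "final_layer"), "bf16/f16"),
    (("img_attn.qkv", "linear1"), +2),                   # two bits up
    (("txt_mod.lin", "txt_mlp", "txt_attn.proj"), -1),   # one bit down
]


def shift_bits(base_bits, delta):
    """Move base_bits by delta and snap to the nearest width available in LADDER."""
    target = base_bits + delta
    widths = sorted(LADDER)
    if delta > 0:
        higher = [w for w in widths if w >= target]
        return higher[0] if higher else widths[-1]
    lower = [w for w in widths if w <= target]
    return lower[-1] if lower else widths[0]


def pick_quant(tensor_name, base_bits):
    """Return the quant label for a tensor, given the file's base bit width."""
    for patterns, action in RULES:
        if any(p in tensor_name for p in patterns):
            if isinstance(action, str):
                return action
            return LADDER[shift_bits(base_bits, action)]
    return LADDER[base_bits]  # no rule matched: use the base quant type
```

With a 4-bit base, `pick_quant("double_blocks.0.img_attn.qkv.weight", 4)` gives `Q6_K` and `pick_quant("double_blocks.0.txt_mlp.0.weight", 4)` gives `Q3_K` under these assumptions. The tensor names are illustrative examples of names containing the listed substrings.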