Can we create a "GLM-4.6-Distill-GLM-4.5-Air-GGUF"?
#13
by
NKLAR5
- opened
I heard Z.ai once said they wouldn't make a GLM 4.6 Air version, but many people need something like this, especially given GLM 4.6's outstanding programming capabilities.
I'm working on true 1 bpw (bit per weight) quantization. That would bring GLM 4.6 into the same size range as GLM 4.5 Air at Q2_XL.
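As a rough sanity check on that size comparison, here is a minimal sketch of the weight-size arithmetic. The parameter counts (about 355B for GLM 4.6 and about 106B for GLM 4.5 Air) are assumptions taken from the public model cards, not from this thread, and the estimate ignores per-block scales, embeddings kept at higher precision, and other file overhead.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: parameters * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8

# Assumed total parameter counts, in billions (not stated in this thread).
GLM_46_PARAMS_B = 355
GLM_45_AIR_PARAMS_B = 106

# Size of GLM 4.6 at a true 1 bit per weight.
glm46_size_gb = weight_size_gb(GLM_46_PARAMS_B, 1.0)

# Bits per weight at which GLM 4.5 Air would occupy the same space.
air_equivalent_bpw = 1.0 * GLM_46_PARAMS_B / GLM_45_AIR_PARAMS_B

print(f"GLM 4.6 at 1 bpw: ~{glm46_size_gb:.1f} GB")
print(f"Same size as GLM 4.5 Air at ~{air_equivalent_bpw:.2f} bpw")
```

Under those assumptions, 1 bpw on the large model takes about as much space as the Air model at a little over 3 bpw, which is the neighborhood of a 2-to-3-bit quant.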
This needs too much VRAM. The model is 49 x 4 + 21 = 217 GB. Damn, at 24 GB per GPU that's 217 / 24 ≈ 9.04 GPUs.
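The arithmetic above can be sketched in a few lines. This assumes the 24 in the division means 24 GB of VRAM per GPU, and it counts weights only (no KV cache or activation memory), so the real requirement would be higher. The helper name `gpus_needed` is made up for illustration.

```python
import math

def gpus_needed(total_gb: float, vram_per_gpu_gb: float) -> int:
    """Whole number of GPUs required to hold total_gb of weights."""
    return math.ceil(total_gb / vram_per_gpu_gb)

# Figures from the post: 49 x 4 + 21 GB.
total_gb = 49 * 4 + 21

# Assumption: each GPU has 24 GB of VRAM.
vram_per_gpu_gb = 24

print(total_gb)                               # 217
print(round(total_gb / vram_per_gpu_gb, 2))   # 9.04
print(gpus_needed(total_gb, vram_per_gpu_gb)) # 10
```

Since a fractional GPU is not an option, 9.04 rounds up to 10 cards for the weights alone.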
The Q6 can't even run on an AMD Ryzen™ AI Max+ 395, and lower quants are pointless for coding.