GGUF quant
#3, opened by sm54
Hi, do you have a BF16 GGUF for this model? I tried creating one, but it errors out using the latest llama.cpp.
Thanks,
Yeah, we need some Q2-Q4 GGUFs of this one! Cheers!
I learned from https://www.reddit.com/r/LocalLLaMA/comments/1oh57ys/comment/nlmw0za/ that convert_hf_to_gguf.py can now convert directly from FP8 safetensors to a BF16 GGUF, since https://github.com/ggml-org/llama.cpp/pull/14810 was merged a few days ago.
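For anyone else trying this, a minimal sketch of the conversion, assuming a llama.cpp checkout at or past that PR; the model directory path and output filename are placeholders:

```bash
# Install llama.cpp's conversion dependencies first.
pip install -r requirements.txt

# Convert FP8 safetensors directly to a BF16 GGUF
# (supported since llama.cpp PR #14810).
# ./path/to/model is a placeholder for the downloaded HF checkpoint dir.
python convert_hf_to_gguf.py ./path/to/model \
    --outtype bf16 \
    --outfile model-bf16.gguf
```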
Just cooked up an IQ4_KS quant and it looks good so far!
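For reference, quantizing from a BF16 GGUF would look roughly like this. Note that IQ4_KS is a quant type from the ik_llama.cpp fork rather than mainline llama.cpp, so this assumes a build of that fork; the filenames carry over from the conversion sketch above:

```bash
# Quantize the BF16 GGUF down to IQ4_KS.
# llama-quantize usage: input.gguf output.gguf TYPE [nthreads]
./llama-quantize model-bf16.gguf model-IQ4_KS.gguf IQ4_KS
```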