GGUF quant

#3 by sm54

Hi, do you have a BF16 GGUF for this model? I tried creating one myself, but the conversion errors out even with the latest llama.cpp.

Thanks,

Yeah, we need some Q2-Q4 GGUFs of this one! Cheers!

I learned from https://www.reddit.com/r/LocalLLaMA/comments/1oh57ys/comment/nlmw0za/ that convert_hf_to_gguf.py can now convert FP8 safetensors directly to a BF16 GGUF, since https://github.com/ggml-org/llama.cpp/pull/14810 was merged a few days ago.
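For anyone who wants to try it, here's a minimal sketch of the conversion, assuming the downloaded FP8 model sits in a local directory (./model-fp8 is a placeholder) and your llama.cpp checkout is recent enough to include that PR:

```bash
# Convert FP8 safetensors directly to a BF16 GGUF.
# ./model-fp8 is a placeholder for the model's download directory.
python convert_hf_to_gguf.py ./model-fp8 \
    --outtype bf16 \
    --outfile model-bf16.gguf
```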

Just cooked up an IQ4_KS quant and it looks good so far 😁
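For anyone following along: IQ4_KS isn't in mainline llama.cpp, it comes from the ik_llama.cpp fork. Roughly, quantizing the BF16 GGUF from above would look like the sketch below (paths are placeholders, and the binary name may differ depending on your build):

```bash
# Quantize the BF16 GGUF down to IQ4_KS with ik_llama.cpp's quantize tool.
# IQ4_KS is an ik_llama.cpp quant type, not available in mainline llama.cpp.
./llama-quantize model-bf16.gguf model-IQ4_KS.gguf IQ4_KS
```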
