How to use "imatrix"?

#1 by bidedadi12345 - opened

How do I use this "Huihui-gpt-oss-20b-BF16-abliterated-v2.imatrix.gguf" file and its corresponding model in LM Studio?

You don't. You only need it when creating your own weighted/imatrix quants with llama-quantize, which ships with llama.cpp. All quants in this repository are already weighted/imatrix quants, so just download and use them instead.
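For reference, if you ever do want to build your own imatrix quants, a typical llama-quantize invocation looks roughly like this (a sketch, assuming a local llama.cpp build; the BF16 source GGUF filename is a placeholder):

```sh
# Placeholder source filename; use the actual BF16/F16 GGUF of the model
./llama-quantize \
  --imatrix Huihui-gpt-oss-20b-BF16-abliterated-v2.imatrix.gguf \
  Huihui-gpt-oss-20b-BF16-abliterated-v2.BF16.gguf \
  Huihui-gpt-oss-20b-BF16-abliterated-v2.i1-Q4_K_M.gguf \
  Q4_K_M
```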

In LM Studio, I loaded Huihui-gpt-oss-20b-BF16-abliterated-v2.i1-MXFP4_MOE.gguf. Works very well XD

I'm running `ollama run hf.co/mradermacher/Huihui-gpt-oss-20b-BF16-abliterated-v2-i1-GGUF:IQ1_S` and whatever prompt I try, the response is always empty. Why is that?

@kevbarns You almost certainly want to use Huihui-gpt-oss-20b-BF16-abliterated-v2.i1-MXFP4_MOE.gguf and NOT IQ1_S: for this model there is no meaningful size difference between them, because we don't requantize the 4-bit layers, which make up almost the entire model. I'm not at all surprised that very low-bpw quants don't work. Besides that, your ollama command looks wrong. I don't use ollama myself, but it seems strange that you don't need to specify the GGUF filename. Check if there is an error log; there likely is an error.
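If you want to rule out ollama as the culprit, a quick sanity check is to run the quant directly with llama.cpp (a sketch, assuming a local llama.cpp build and the file downloaded to the current directory):

```sh
# Download Huihui-gpt-oss-20b-BF16-abliterated-v2.i1-MXFP4_MOE.gguf from this repo first
./llama-cli \
  -m Huihui-gpt-oss-20b-BF16-abliterated-v2.i1-MXFP4_MOE.gguf \
  -p "Hello, who are you?" \
  -n 128
```

If that produces output, the problem is on the ollama side rather than with the quant itself.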
