---
quantized_by: gghfez
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-R1-Zero
license: mit
base_model_relation: quantized
tags:
- mla
- imatrix
- deepseek_v3
- conversational
- ik_llama.cpp
---
## `ik_llama.cpp` imatrix MLA Quantizations of deepseek-ai/DeepSeek-R1-Zero

This is an IQ2_KS quant of deepseek-ai/DeepSeek-R1-Zero using [ubergarm](https://huggingface.co/ubergarm)'s IQ2_KS recipe from [ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF](https://huggingface.co/ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF).
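
To reproduce a quant like this, ik_llama.cpp's `llama-quantize` is invoked roughly as sketched below. This is illustrative only: the file paths are placeholders, and ubergarm's actual recipe applies per-tensor type overrides rather than the single blanket type shown here, so refer to the linked model card for the real recipe.

```bash
# Rough sketch only: paths are placeholders, and the real recipe uses
# per-tensor overrides instead of one blanket type. IQ2_KS is a quant
# type that exists only in the ik_llama.cpp fork.
./build/bin/llama-quantize \
    --imatrix /models/imatrix-DeepSeek-R1-Zero.dat \
    /models/DeepSeek-R1-Zero-256x21B-BF16.gguf \
    /models/DeepSeek-R1-Zero-IQ2_KS.gguf \
    IQ2_KS
```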

This quant collection **REQUIRES** the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork, which supports the advanced non-linear SotA quants and Multi-Head Latent Attention (MLA) used here. Do **not** download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.!
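
For those new to the fork, a minimal sketch of building it and serving this quant follows. `-mla` and `-fmoe` are ik_llama.cpp-specific flags, and every path and value below is an assumption to adapt to your hardware; check `./build/bin/llama-server --help` for the authoritative list.

```bash
# Build the ik_llama.cpp fork (CUDA backend shown; adjust to your setup).
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Serve the quant: -mla selects an MLA implementation, -fa enables flash
# attention, -fmoe fuses MoE ops. Paths and values are illustrative.
./build/bin/llama-server \
    --model /models/DeepSeek-R1-Zero-IQ2_KS.gguf \
    --ctx-size 32768 \
    -mla 2 -fa \
    -fmoe \
    --n-gpu-layers 99 \
    --host 127.0.0.1 --port 8080
```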

I've uploaded the converted BF16 weights to [gghfez/DeepSeek-R1-Zero-256x21B-BF16](https://huggingface.co/gghfez/DeepSeek-R1-Zero-256x21B-BF16) in case I, or anyone else, wants to create similar quants in the future.
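
If you want to redo the BF16 conversion yourself, the usual route is the fork's `convert_hf_to_gguf.py`. Note that the original DeepSeek release ships FP8 weights, so you may first need to dequantize them to BF16 safetensors (DeepSeek provide a script for this in their repo). Paths below are placeholders:

```bash
# Illustrative only: paths are placeholders. Input is a local directory
# of BF16 safetensors (dequantized from DeepSeek's FP8 release).
python convert_hf_to_gguf.py \
    --outtype bf16 \
    --outfile /models/DeepSeek-R1-Zero-256x21B-BF16.gguf \
    /models/DeepSeek-R1-Zero-bf16
```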

Note: I may delete gghfez/DeepSeek-R1-Zero-256x21B-BF16 soon due to the new Hugging Face storage limits.