---
quantized_by: gghfez
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-R1-Zero
license: mit
base_model_relation: quantized
tags:
  - mla
  - imatrix
  - deepseek_v3
  - conversational
  - ik_llama.cpp
---

ik_llama.cpp imatrix MLA Quantizations of deepseek-ai/DeepSeek-R1-Zero

This is an IQ2_KS quant of deepseek-ai/DeepSeek-R1-Zero using ubergarm's IQ2_KS recipe from ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF.

This quant collection REQUIRES the ik_llama.cpp fork, which supports the advanced non-linear SotA quants and Multi-Head Latent Attention (MLA) used here. Do not download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.!

I've uploaded the converted BF16 weights as gghfez/DeepSeek-R1-Zero-256x21B-BF16 in case I or anyone else wants to create similar quants in the future.

Note: I may delete gghfez/DeepSeek-R1-Zero-256x21B-BF16 soon because of the new Hugging Face storage limits.