---
quantized_by: gghfez
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-R1-Zero
license: mit
base_model_relation: quantized
tags:
- mla
- imatrix
- deepseek_v3
- conversational
- ik_llama.cpp
---
## `ik_llama.cpp` imatrix MLA Quantizations of deepseek-ai/DeepSeek-R1-Zero

This is an IQ2_KS quant of deepseek-ai/DeepSeek-R1-Zero using [ubergarm](https://huggingface.co/ubergarm)'s IQ2_KS recipe from [ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF](https://huggingface.co/ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF).
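
To reproduce a quant like this, ik_llama.cpp's `llama-quantize` is invoked roughly as sketched below. This is illustrative only: the file paths are placeholders, and ubergarm's actual recipe applies per-tensor type overrides rather than the single blanket type shown here, so refer to the linked model card for the real recipe.

```bash
# Rough sketch only: paths are placeholders, and the real recipe uses
# per-tensor overrides instead of one blanket type. IQ2_KS is a quant
# type that exists only in the ik_llama.cpp fork.
./build/bin/llama-quantize \
    --imatrix /models/imatrix-DeepSeek-R1-Zero.dat \
    /models/DeepSeek-R1-Zero-256x21B-BF16.gguf \
    /models/DeepSeek-R1-Zero-IQ2_KS.gguf \
    IQ2_KS
```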

This quant collection **REQUIRES** the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork, which supports the advanced non-linear SotA quants and Multi-Head Latent Attention (MLA) used here. Do **not** download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.!
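
For those new to the fork, a minimal sketch of building it and serving this quant follows. `-mla` and `-fmoe` are ik_llama.cpp-specific flags, and every path and value below is an assumption to adapt to your hardware; check `./build/bin/llama-server --help` for the authoritative list.

```bash
# Build the ik_llama.cpp fork (CUDA backend shown; adjust to your setup).
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Serve the quant: -mla selects an MLA implementation, -fa enables flash
# attention, -fmoe fuses MoE ops. Paths and values are illustrative.
./build/bin/llama-server \
    --model /models/DeepSeek-R1-Zero-IQ2_KS.gguf \
    --ctx-size 32768 \
    -mla 2 -fa \
    -fmoe \
    --n-gpu-layers 99 \
    --host 127.0.0.1 --port 8080
```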

I've uploaded the converted BF16 weights to [gghfez/DeepSeek-R1-Zero-256x21B-BF16](https://huggingface.co/gghfez/DeepSeek-R1-Zero-256x21B-BF16) in case I, or anyone else, wants to create similar quants in the future.
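
If you want to redo the BF16 conversion yourself, the usual route is the fork's `convert_hf_to_gguf.py`. Note that the original DeepSeek release ships FP8 weights, so you may first need to dequantize them to BF16 safetensors (DeepSeek provide a script for this in their repo). Paths below are placeholders:

```bash
# Illustrative only: paths are placeholders. Input is a local directory
# of BF16 safetensors (dequantized from DeepSeek's FP8 release).
python convert_hf_to_gguf.py \
    --outtype bf16 \
    --outfile /models/DeepSeek-R1-Zero-256x21B-BF16.gguf \
    /models/DeepSeek-R1-Zero-bf16
```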

Note: I may delete gghfez/DeepSeek-R1-Zero-256x21B-BF16 soon due to the new Hugging Face storage limits.