sillykiwi/Nemotron-H-4B-Instruct-128K-Q6_K-GGUF

This model was converted to GGUF format from nvidia/Nemotron-H-4B-Instruct-128K using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

KoboldCpp gives me errors when loading this. It is possible this quantization is corrupt.
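
One quick way to narrow this down is to check whether the downloaded file at least starts with a well-formed GGUF header. The sketch below is only a sanity check, not a full validation. It assumes a GGUF version 2 or later layout (4-byte magic `GGUF`, a little-endian 32-bit version, then 64-bit tensor and metadata counts). The function name `read_gguf_header` is made up for this example.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the GGUF v2+ layout with 64-bit counts. Raises ValueError if
    the file is too short or does not start with the GGUF magic bytes.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(header) < 24:
        raise ValueError("file too short to contain a GGUF header")
    if header[:4] != GGUF_MAGIC:
        raise ValueError(f"bad magic bytes: {header[:4]!r}")
    # "<IQQ": little-endian uint32 version, uint64 tensors, uint64 KV pairs
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header` on the local path of the downloaded file should return a small version number and plausible counts. A `ValueError` here suggests a truncated or damaged download. A valid header does not rule out damage later in the file, and a load error can also have other causes, such as a loader build that does not yet support the model's architecture.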

Format: GGUF
Model size: 4B params
Architecture: nemotron_h
Quantization: 6-bit

