sillykiwi/Nemotron-H-4B-Instruct-128K-Q6_K-GGUF

This model was converted to GGUF format from nvidia/Nemotron-H-4B-Instruct-128K using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

KoboldCpp gives me errors when loading this. It is possible this quantization is corrupt.
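
One cheap way to rule out a truncated or mis-downloaded file is to read the GGUF header directly. The sketch below is a minimal check, not a full validator. It assumes a little-endian GGUF file of version 2 or later, where the 4-byte magic `GGUF` is followed by a uint32 version and two uint64 counts. The function name `read_gguf_header` is made up for illustration and is not part of llama.cpp or KoboldCpp.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) or raise ValueError.

    Assumes a little-endian GGUF file, version 2 or later.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4-byte magic + uint32 + two uint64 fields
    if len(header) < 24:
        raise ValueError("file is too short to contain a GGUF header")
    if header[:4] != GGUF_MAGIC:
        raise ValueError(f"bad magic {header[:4]!r}; not a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata count
    version, n_tensors, n_kv = struct.unpack("<IQQ", header[4:24])
    return version, n_tensors, n_kv
```

Calling `read_gguf_header` on the downloaded file and getting a plausible version and non-zero tensor count suggests the header at least is intact. It does not prove the tensor data is undamaged, and it cannot tell whether a given loader supports the nemotron_h architecture.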

Format: GGUF
Model size: 4B params
Architecture: nemotron_h
Quantization: 6-bit (Q6_K)
