This is FBL's una-cybertron-7b-v2, converted to GGUF. No other changes were made.

Two files are available here:

  • una-cybertron-7b-v2-fp16.gguf: the original model converted to GGUF without quantization
  • una-cybertron-7b-v2-q8_0-LOT.gguf: the original model converted to GGUF with q8_0 quantization using the --leave-output-tensor command-line option
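
Either file can be run with llama.cpp or any of its bindings. As a minimal sketch, here is how the q8_0 file might be loaded with the llama-cpp-python bindings (assumptions: llama-cpp-python is installed and the GGUF file has been downloaded locally; the path, prompt, and parameters are placeholders):

```python
# Minimal sketch: run the q8_0 GGUF with the llama-cpp-python bindings.
# Assumes `pip install llama-cpp-python` and a locally downloaded file.
from llama_cpp import Llama

llm = Llama(
    model_path="una-cybertron-7b-v2-q8_0-LOT.gguf",  # local path to the downloaded file
    n_ctx=4096,  # context window; adjust to available memory
)

output = llm(
    "Explain what GGUF quantization does in one sentence.",
    max_tokens=128,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```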

From llama.cpp/quantize --help:

--leave-output-tensor: Will leave output.weight un(re)quantized. Increases model size but may also increase quality, especially when requantizing

The model was converted using convert.py from Georgi Gerganov's llama.cpp repo, release b1620.

All credit belongs to FBL for fine-tuning and releasing this model. Thank you!
