[🏠Sentient Simulations] | [Discord] | [Patreon]
# BlackSheep-24B-GPTQ
This repository contains a 4-bit GPTQ-quantized version of the TroyDoesAI/BlackSheep-24B model, produced with llm-compressor.
## Quantization Settings
| Attribute | Value |
|---|---|
| Algorithm | GPTQ |
| Layers | Linear |
| Weight Scheme | W4A16 |
| Group Size | 128 |
| Calibration Dataset | openerotica/erotiquant3 |
| Calibration Sequence Length | 4900 |
| Calibration Samples | 1024 |
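Under the W4A16 scheme, each linear-layer weight is stored as a 4-bit integer with one floating-point scale shared per group of 128 consecutive weights, while activations stay in 16-bit. The sketch below is an illustrative NumPy version of symmetric group-wise 4-bit quantization only; it is not the llm-compressor implementation, which additionally uses GPTQ's error-compensating updates.

```python
import numpy as np

def quantize_w4_groupwise(weights: np.ndarray, group_size: int = 128):
    """Symmetric 4-bit group-wise quantization of a flat weight vector."""
    assert weights.size % group_size == 0
    groups = weights.reshape(-1, group_size)
    # One scale per group: map the group's max magnitude onto the int4 range [-8, 7].
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scales = np.maximum(scales, 1e-8)  # guard against all-zero groups
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return (q * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scales = quantize_w4_groupwise(w)
err = np.abs(dequantize(w := w, scales=scales, q=q) if False else dequantize(q, scales) - w).max()
print(f"groups: {q.shape[0]}, max abs error: {err:.4f}")
```

Smaller groups give each scale less dynamic range to cover, so a group size of 128 trades a little extra storage for noticeably lower per-weight error than whole-row scaling.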
## Dataset Preprocessing
The dataset was preprocessed with the following steps:
- Extract and structure the conversation data using role-based templates (SYSTEM, USER, ASSISTANT).
- Convert the structured conversations into a tokenized format using the model's tokenizer.
- Filter out sequences shorter than 4096 tokens.
- Shuffle and select 512 samples for calibration.
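The steps above can be sketched in plain Python. The whitespace tokenizer, role-template format, and synthetic conversations here are stand-ins for the model's real tokenizer and the actual calibration script:

```python
import random

def tokenize(text: str) -> list[str]:
    # Stand-in tokenizer; the real pipeline uses the model's own tokenizer.
    return text.split()

def format_conversation(turns: list[dict]) -> str:
    # Role-based template: SYSTEM/USER/ASSISTANT headers around each turn.
    return "\n".join(f"{t['role'].upper()}: {t['content']}" for t in turns)

def build_calibration_set(conversations, min_tokens=4096, num_samples=512, seed=42):
    tokenized = [tokenize(format_conversation(c)) for c in conversations]
    # Filter out sequences shorter than min_tokens, then shuffle and select.
    kept = [toks for toks in tokenized if len(toks) >= min_tokens]
    random.Random(seed).shuffle(kept)
    return kept[:num_samples]

# Tiny demo: three synthetic conversations of increasing length.
convs = [
    [{"role": "system", "content": "x " * n},
     {"role": "user", "content": "y " * n},
     {"role": "assistant", "content": "z " * n}]
    for n in (100, 2000, 3000)
]
calib = build_calibration_set(convs, min_tokens=4096, num_samples=2)
print(len(calib))
```

Filtering before selection ensures every calibration sample fills most of the calibration sequence length, which gives the quantizer denser activation statistics per sample.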
## Quantization Process
View the shell and Python scripts used to quantize this model. Quantization was run on a single local RTX 3090 with 128 GB of system RAM.
## Acknowledgments
- Base Model: TroyDoesAI/BlackSheep-24B
- Calibration Dataset: openerotica/erotiquant3
- LLM Compressor: llm-compressor
- Everyone subscribed to the Sentient Simulations Patreon