AWQ-quantized version of Kortix FastApply 1.5B v1.0

--model av-codes/kortix-fast-apply-1.5B-v1.0-awq
--enforce-eager
--disable-log-requests
--quantization awq_marlin
--max-model-len 8192
--gpu-memory-utilization 0.25
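
The flags above look like vLLM engine arguments, but the card does not say how they are meant to be passed. As a minimal sketch, assuming vLLM is installed and that its OpenAI-compatible server module (`vllm.entrypoints.openai.api_server`) is the intended entry point, the full launch command could be assembled as follows. The `build_command` helper is hypothetical and exists only for illustration; some flags (for example `--disable-log-requests`) may be renamed or deprecated depending on the vLLM version.

```python
import shlex

# Flags copied verbatim from the model card.
VLLM_ARGS = [
    "--model", "av-codes/kortix-fast-apply-1.5B-v1.0-awq",
    "--enforce-eager",
    "--disable-log-requests",
    "--quantization", "awq_marlin",
    "--max-model-len", "8192",
    "--gpu-memory-utilization", "0.25",
]


def build_command(python="python"):
    """Return the argv list for launching the server (hypothetical helper).

    Assumption: the flags are consumed by vLLM's OpenAI-compatible server
    module; the card itself does not name the entry point.
    """
    return [python, "-m", "vllm.entrypoints.openai.api_server", *VLLM_ARGS]


if __name__ == "__main__":
    # Print a shell-safe command line that can be pasted into a terminal.
    print(shlex.join(build_command()))
```

Building the command as a list rather than a single string keeps each flag and value as a separate argument, so it can be handed to `subprocess.run` without shell quoting issues.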

Format: Safetensors
Model size: 2B params
Tensor types: I32, BF16, F16

Model tree for av-codes/kortix-fast-apply-1.5B-v1.0-awq

Base model: Kortix/FastApply-1.5B-v1.0
Quantized (8): this model