Python script to convert and quantize models to gguf

#4
by ankmaury - opened

Hi @calcuis ,
Can you please share the script you used to convert and quantize these models to GGUF?

Owner

it's a simple conversion; you could convert it with gguf-connector or gguf-node, then further quantize it with gguf-cutter; if that doesn't work for you, please refer to the script here for building your own quantizer
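
as a side note, you can always sanity-check whatever a converter produced with the official gguf package from llama.cpp (pip install gguf); a tiny sketch, where the filename is just an example:

```python
# dump the metadata keys and tensor list of a GGUF file
from gguf import GGUFReader

reader = GGUFReader("t3_cfg-f16.gguf")  # example filename

# metadata keys (architecture, quantization version, file type, ...)
for key in reader.fields:
    print(key)

# every tensor with its quantization type and shape
for t in reader.tensors:
    print(t.name, t.tensor_type.name, tuple(t.shape))
```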

Hi @calcuis ,
How can I use the gguf-convertor to convert my PyTorch FP32 files to GGUF FP16?

Owner

simple method: you could use the convertor zero from gguf-node (pypi|repo|pack)
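
if you'd rather script the conversion yourself, a minimal sketch with the official gguf package looks like this; the input filename is hypothetical, and the arch string and three metadata keys simply mirror the dump further down this thread:

```python
# minimal FP32 -> F16 GGUF converter sketch (pip install gguf torch)
import torch
from gguf import GGUFWriter

# assumes the checkpoint holds a plain state dict; "t3_cfg.pt" is hypothetical
state_dict = torch.load("t3_cfg.pt", map_location="cpu")

writer = GGUFWriter("t3_cfg-f16.gguf", arch="pig")
writer.add_quantization_version(2)
writer.add_file_type(1)  # 1 = F16 in llama.cpp's file-type numbering

for name, t in state_dict.items():
    if t.dim() == 1:
        # keep 1-D tensors (norms, biases) in f32, matching the f32/f16 split in the log below
        writer.add_tensor(name, t.to(torch.float32).numpy())
    else:
        writer.add_tensor(name, t.to(torch.float16).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```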

Hi @calcuis ,
I tried to quantize your t3_cfg-f16 model using llama-quantize, but I get this error:

./llama-quantize C:\Users\anmaurya\Downloads\t3_cfg-f16.gguf C:\Users\anmaurya\IGI\ggml\examples\tts\models_quant\T3-532M-Q2_K.gguf Q2_K
main: build = 6569 (e7890955)
main: built with MSVC 19.44.35213.0 for x64
main: quantizing 'C:\Users\anmaurya\Downloads\t3_cfg-f16.gguf' to 'C:\Users\anmaurya\IGI\ggml\examples\tts\models_quant\T3-532M-Q2_K.gguf' as Q2_K
llama_model_loader: loaded meta data with 3 key-value pairs and 292 tensors from C:\Users\anmaurya\Downloads\t3_cfg-f16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = pig
llama_model_loader: - kv 1: general.quantization_version u32 = 2
llama_model_loader: - kv 2: general.file_type u32 = 1
llama_model_loader: - type f32: 70 tensors
llama_model_loader: - type f16: 222 tensors
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA RTX 3500 Ada Generation Laptop GPU, compute capability 8.9, VMM: yes
register_backend: registered backend CUDA (1 devices)
register_device: registered device CUDA0 (NVIDIA RTX 3500 Ada Generation Laptop GPU)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Intel(R) Core(TM) Ultra 7 165H)
llama_model_quantize: failed to quantize: unknown model architecture: 'pig'
main: failed to quantize model from 'C:\Users\anmaurya\Downloads\t3_cfg-f16.gguf'
(base) PS C:\Users\anmaurya\IGI\llama.cpp\build\bin\Debug>

How do I go ahead with quantization?

Owner

you can't use the normal llama-quantize since this architecture isn't on their list; it will just return "unknown model architecture"; either use gguf-cutter or build a custom quantizer for that task
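
for a rough idea of what such a custom quantizer could look like, here is a sketch built on the gguf package's pure-Python quants module; Q8_0 is used because its Python encoder is definitely available there, whereas K-quant (e.g. Q2_K) encoding support depends on your gguf version; it also assumes a recent gguf where reader tensors carry their full shape:

```python
# sketch of a standalone quantizer for architectures llama-quantize rejects;
# requantizes the f16 tensors of a GGUF to Q8_0 in pure Python
import numpy as np
from gguf import GGUFReader, GGUFWriter, GGMLQuantizationType
from gguf.quants import quantize

reader = GGUFReader("t3_cfg-f16.gguf")
writer = GGUFWriter("t3_cfg-q8_0.gguf", arch="pig")
writer.add_quantization_version(2)
writer.add_file_type(7)  # 7 = MOSTLY_Q8_0 in llama.cpp's file-type numbering

for t in reader.tensors:
    if t.tensor_type == GGMLQuantizationType.F16:
        # requantize the f16 weights; quantize() returns the raw Q8_0 blocks
        data = quantize(t.data.astype(np.float32), GGMLQuantizationType.Q8_0)
        writer.add_tensor(t.name, data, raw_dtype=GGMLQuantizationType.Q8_0)
    else:
        # pass the f32 tensors (norms, biases) through untouched
        writer.add_tensor(t.name, t.data)

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```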
