# Gemma-3-1b-it Q4_0 Quantized Model
This is a Q4_0 quantized version of the google/gemma-3-1b-it model, converted to GGUF format and optimized for efficient inference. It was created using llama.cpp tools in Google Colab.
## Model Details
- Base Model: google/gemma-3-1b-it
- Quantization: Q4_0 (4-bit quantization)
- Format: GGUF
- Size: ~1–1.5 GB
- Converted Using: `llama.cpp` (commit from April 2025)
- License: Inherits the license from google/gemma-3-1b-it
## Usage

To use this model with `llama.cpp`:

```bash
./llama-cli -m gemma-3-1b-it-Q4_0.gguf --prompt "Hello, world!" --no-interactive
```
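If you prefer an HTTP interface, `llama.cpp` also ships `llama-server`, which exposes an OpenAI-compatible API. A minimal sketch, assuming the GGUF file sits in the current directory (the port is arbitrary):

```bash
# Serve the quantized model over HTTP on port 8080.
./llama-server -m gemma-3-1b-it-Q4_0.gguf --port 8080

# In another shell: query the OpenAI-compatible chat endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello, world!"}]}'
```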
## How It Was Created

1. Downloaded google/gemma-3-1b-it from Hugging Face.
2. Converted to GGUF using `convert_hf_to_gguf.py`.
3. Quantized to Q4_0 using `llama-quantize` from `llama.cpp`.
4. Tested in Google Colab with `llama-cli`.
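In shell form, the pipeline above looks roughly like the following. This is a sketch rather than the exact Colab notebook: the directory and file names are assumptions, and the conversion script and binaries live in a `llama.cpp` checkout.

```bash
# Download the original checkpoint from Hugging Face.
huggingface-cli download google/gemma-3-1b-it --local-dir gemma-3-1b-it

# Convert the Hugging Face checkpoint to a full-precision GGUF file.
python convert_hf_to_gguf.py gemma-3-1b-it --outfile gemma-3-1b-it-f16.gguf

# Quantize the GGUF file down to Q4_0.
./llama-quantize gemma-3-1b-it-f16.gguf gemma-3-1b-it-Q4_0.gguf Q4_0

# Smoke-test the result.
./llama-cli -m gemma-3-1b-it-Q4_0.gguf --prompt "Hello, world!"
```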
## Limitations

- Quantization may reduce accuracy compared to the original model (see the perplexity sketch after this list).
- Requires `llama.cpp` or compatible software for inference.
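To quantify the accuracy impact on your own data, `llama.cpp`'s perplexity tool can evaluate the quantized model against a reference text file. A minimal sketch, assuming a local copy of a test corpus such as WikiText-2:

```bash
# Lower perplexity is better; run the same command against an
# unquantized GGUF to measure the gap (wiki.test.raw is an assumed path).
./llama-perplexity -m gemma-3-1b-it-Q4_0.gguf -f wiki.test.raw
```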
## Acknowledgments

- Based on the work of bartowski for GGUF quantization.
- Uses `llama.cpp` by Georgi Gerganov.