---
language:
- en
library_name: transformers
license: apache-2.0
tags:
- int8
- w8a8
- smoothquant
- gptq
- gemma-3
- abliterated
base_model: mlabonne/gemma-3-27b-it-abliterated
---

# Gemma 3 27B Abliterated - W8A8 INT8

Quantized version of [mlabonne/gemma-3-27b-it-abliterated](https://huggingface.co/mlabonne/gemma-3-27b-it-abliterated) with W8A8 quantization (INT8 weights and activations).

## Quantization Config

- **Method**: SmoothQuant + GPTQ
- **Precision**: 8-bit weights, 8-bit activations (INT8)
- **SmoothQuant**: smoothing_strength=0.5
- **GPTQ**: scheme=W8A8, block_size=128
- **Calibration**: 512 samples from ultrachat-200k, max_seq_length=2048
- **Model size**: ~27 GB

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "neonconverse/gemma-3-27b-abliterated-w8a8-8bit",
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",  # keep the dtypes stored in the checkpoint
)
tokenizer = AutoTokenizer.from_pretrained("neonconverse/gemma-3-27b-abliterated-w8a8-8bit")
```
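Once loaded, the model can be prompted through the standard `transformers` chat-template API. A minimal generation sketch continuing from the snippet above (the prompt is illustrative):

```python
# Build a chat prompt with the model's built-in chat template.
messages = [
    {"role": "user", "content": "Summarize W8A8 quantization in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```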
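## Reproducing the Quantization

The configuration above (SmoothQuant at strength 0.5 followed by GPTQ at W8A8 with a block size of 128, calibrated on 512 ultrachat-200k samples) mirrors a typical `llm-compressor` one-shot recipe. The sketch below shows how such a checkpoint is usually produced; it assumes `llm-compressor` was the tool used, and the module targets, ignore list, and dataset name are illustrative, not the verified recipe for this checkpoint:

```python
# Hypothetical reproduction sketch with llm-compressor; parameters mirror
# the config listed above, but this is not the confirmed original recipe.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    # Migrate activation outliers into the weights before quantization.
    SmoothQuantModifier(smoothing_strength=0.5),
    # Quantize linear layers to INT8 weights and activations with GPTQ.
    GPTQModifier(
        targets="Linear",
        scheme="W8A8",
        block_size=128,
        ignore=["lm_head"],  # keep the output head unquantized (assumption)
    ),
]

oneshot(
    model="mlabonne/gemma-3-27b-it-abliterated",
    dataset="ultrachat_200k",  # registered calibration dataset name (assumption)
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="gemma-3-27b-abliterated-w8a8-8bit",
)
```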