neonconverse committed on
Commit e18adc5 · verified · 1 Parent(s): c17edce

Upload README.md with huggingface_hub
---
language:
- en
library_name: transformers
license: apache-2.0
tags:
- int8
- w8a8
- smoothquant
- gptq
- gemma-3
- abliterated
base_model: mlabonne/gemma-3-27b-it-abliterated
---

# Gemma 3 27B Abliterated - W8A8 INT8

A W8A8 INT8 quantization of [mlabonne/gemma-3-27b-it-abliterated](https://huggingface.co/mlabonne/gemma-3-27b-it-abliterated): 8-bit weights and 8-bit activations, produced with SmoothQuant + GPTQ.

## Quantization Config

- **Method**: SmoothQuant + GPTQ
- **Precision**: 8-bit weights, 8-bit activations
- **SmoothQuant**: smoothing_strength=0.5
- **GPTQ**: scheme=W8A8, block_size=128
- **Calibration**: 512 samples from ultrachat-200k, max_seq_length=2048
- **Model size**: ~27 GB

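As an illustrative sketch of how the settings above fit together (toy values, not the actual pipeline): SmoothQuant computes a per-channel scale `s_j = a_j**alpha / w_j**(1 - alpha)` with `alpha = smoothing_strength = 0.5`, divides the activations by `s_j` and folds `s_j` into the weights, so outlier activation channels no longer dominate the shared INT8 scale.

```python
# Toy illustration (assumed values, not the real pipeline) of how
# smoothing_strength=0.5 tames activation outliers before symmetric
# INT8 (W8A8) quantization.

def smooth_scales(act_absmax, wt_absmax, alpha=0.5):
    """SmoothQuant per-channel scales: s_j = a_j**alpha / w_j**(1 - alpha)."""
    return [a ** alpha / w ** (1 - alpha) for a, w in zip(act_absmax, wt_absmax)]

def quantize_int8(values):
    """Symmetric per-tensor INT8: map the abs-max to 127, round the rest."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 127.0
    return [max(-127, min(127, round(v / scale))) for v in values], scale

# Channel 2 is an activation outlier: ~100x larger than its neighbours.
act_absmax = [1.0, 0.8, 80.0, 1.2]   # per-channel activation abs-max
wt_absmax  = [0.5, 0.4, 0.5, 0.6]    # per-channel weight abs-max

s = smooth_scales(act_absmax, wt_absmax)
# Activations are divided by s_j (the weights would be multiplied by s_j).
smoothed = [a / sj for a, sj in zip(act_absmax, s)]

q, scale = quantize_int8(smoothed)
roundtrip = [v * scale for v in q]
```

After smoothing, the dynamic range across channels shrinks from 100x to about 11x, so a single INT8 scale covers every channel with far less wasted resolution.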
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "neonconverse/gemma-3-27b-abliterated-w8a8-8bit",
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",  # keep the dtypes stored in the checkpoint
)
tokenizer = AutoTokenizer.from_pretrained("neonconverse/gemma-3-27b-abliterated-w8a8-8bit")
```
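W8A8 checkpoints from the SmoothQuant + GPTQ flow are usually saved in the compressed-tensors format, which vLLM can run with native INT8 kernels. A possible serving command, assuming this repo follows that convention (check `config.json` for a `quantization_config` entry):

```shell
# Hypothetical: serve the INT8 checkpoint with vLLM (needs ~27 GB of VRAM
# for the weights, plus headroom for the KV cache).
vllm serve neonconverse/gemma-3-27b-abliterated-w8a8-8bit
```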