vincentoh committed
Commit d2b5178 · verified · 1 Parent(s): d9e708a

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +87 -0
README.md ADDED
@@ -0,0 +1,87 @@
---
license: llama3
base_model: yueliu1999/GuardReasoner-8B
tags:
- llama
- safety
- content-moderation
- 4-bit
- bitsandbytes
- quantized
library_name: transformers
pipeline_tag: text-generation
---

# GuardReasoner-8B (4-bit Quantized)

Pre-quantized 4-bit version of [yueliu1999/GuardReasoner-8B](https://huggingface.co/yueliu1999/GuardReasoner-8B) for efficient inference.

## Model Description

GuardReasoner-8B is a reasoning-based LLM safeguard that produces a step-by-step analysis before classifying a request as safe or harmful. This version is quantized to 4-bit with bitsandbytes NF4, reducing the model size from ~16 GB to ~5.4 GB while keeping accuracy close to the original model.

**Paper**: [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://arxiv.org/abs/2501.18492)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "vincentoh/guardreasoner-8b-4bit"

# The checkpoint ships with its quantization config, so it loads in 4-bit
# automatically (requires the bitsandbytes and accelerate packages).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
)

# Build the prompt in Llama 3 chat format
text = "What is the capital of France?"
prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a content safety expert. Analyze the request and determine if it is safe or harmful.

Think step-by-step:
1. What is being requested?
2. What are the potential harms?
3. Does this violate safety policies?

End your analysis with exactly: "Request: harmful" or "Request: unharmful".<|eot_id|><|start_header_id|>user<|end_header_id|>

{text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens (skip the prompt)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
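
The prompt asks the model to end with exactly `"Request: harmful"` or `"Request: unharmful"`, so the verdict can be recovered with a plain string check. A minimal sketch, assuming that output format is followed (the `classify` helper is illustrative and not part of the original model card):

```python
def classify(response: str) -> str:
    """Map the model's free-form reasoning to a binary label.

    Assumes the prompt above, which asks the model to end with exactly
    "Request: harmful" or "Request: unharmful".
    """
    verdict = response.lower()
    # Check "unharmful" first, because "harmful" is a substring of it.
    if "request: unharmful" in verdict:
        return "unharmful"
    if "request: harmful" in verdict:
        return "harmful"
    return "unknown"  # the model did not follow the requested format

print(classify(response))  # e.g. "unharmful" for the benign example above
```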

## Quantization Details

- **Method**: bitsandbytes 4-bit NF4
- **Compute dtype**: float16
- **Double quantization**: enabled
- **Original size**: ~16 GB
- **Quantized size**: ~5.4 GB

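For reference, these settings correspond to the following `BitsAndBytesConfig`. This is a hedged sketch of how the 4-bit weights can be reproduced from the original checkpoint, not necessarily the exact script used to build this repo; the output directory name is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 quantization with float16 compute and double quantization,
# matching the settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Quantize the original full-precision checkpoint on load.
model = AutoModelForCausalLM.from_pretrained(
    "yueliu1999/GuardReasoner-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.1f} GB")

# Recent transformers/bitsandbytes releases can serialize 4-bit weights,
# which is how a pre-quantized repo like this one is produced.
model.save_pretrained("guardreasoner-8b-4bit")  # illustrative output path
```

When loading this repo directly (as in the Usage section above), the quantization config is read from the checkpoint, so no `BitsAndBytesConfig` needs to be passed.
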
## Performance

Expected ~84% F1 on safety benchmarks, in line with the original full-precision model.

## License

This model inherits the Llama 3 license from the base model.

## Citation

```bibtex
@article{liu2025guardreasoner,
  title={GuardReasoner: Towards Reasoning-based LLM Safeguards},
  author={Liu, Yue and others},
  journal={arXiv preprint arXiv:2501.18492},
  year={2025}
}
```