geoffmunn committed (verified) · Commit 65940d0 · Parent: 0fdecf5

Update README.md
Files changed (1): README.md (+171 -101)
README.md CHANGED
@@ -2,205 +2,275 @@
  base_model: Qwen/Qwen3-4B
  library_name: peft
  tags:
- - base_model:adapter:Qwen/Qwen3-4B
- - lora
- - transformers
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

- ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  ## How to Get Started with the Model

- Use the code below to get started with the model.

- [More Information Needed]

- ## Training Details

- ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

- ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]

- [More Information Needed]

- #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

- ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->

  ### Testing Data, Factors & Metrics

  #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]

  #### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

- [More Information Needed]

  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary
-

- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

- ## Technical Specifications [optional]

  ### Model Architecture and Objective

- [More Information Needed]

  ### Compute Infrastructure

- [More Information Needed]
-
  #### Hardware

- [More Information Needed]

  #### Software

- [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

  **BibTeX:**

- [More Information Needed]

  **APA:**

- [More Information Needed]

- ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

- [More Information Needed]

- ## More Information [optional]

- [More Information Needed]

- ## Model Card Authors [optional]

- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]

  ### Framework versions

  - PEFT 0.18.0

  base_model: Qwen/Qwen3-4B
  library_name: peft
  tags:
+ - base_model:adapter:Qwen/Qwen3-4B
+ - lora
+ - transformers
+ - text-classification
+ - moderation
+ - new-zealand
  ---

+ # Model Card for geoffmunn/Qwen3Guard-NewZealand-Classification-4B

+ This is a fine-tuned version of Qwen3-4B using LoRA (Low-Rank Adaptation) to classify whether user-provided text is related to New Zealand.
+ The model acts as a domain-specific content classifier, returning one of two labels: `"related"` or `"not_related"`.
+ It was developed as part of the Qwen3Guard demonstration project to showcase how large language models can be adapted for custom classification tasks.

  ## Model Details

  ### Model Description

+ This model is a binary sequence classifier fine-tuned on a synthetic dataset of New Zealand-related questions and general non-New Zealand text.
+ Built on the Qwen3-4B foundation model, it uses parameter-efficient fine-tuning via LoRA to adapt the model for topic detection in conversational or other input text.
+ It is designed for moderation systems that need to filter content by geographic, cultural, or national topic, in this case New Zealand.

+ - **Developed by:** Geoff Munn (@geoffmunn)
+ - **Shared by:** Geoff Munn
+ - **Model type:** Causal language model with a LoRA adapter for sequence classification
+ - **Language(s) (NLP):** English
+ - **License:** MIT License (see the GitHub repository)
+ - **Finetuned from model:** Qwen/Qwen3-4B

+ ### Model Sources

+ - **Repository:** https://github.com/geoffmunn/Qwen3Guard
+ - **Demo:** Interactive demo available via `new_zealand_chat.html` in the repository; requires a local API server
+
  ## Uses

  ### Direct Use

+ The model can directly classify whether a given piece of text is related to _New Zealand_. Example applications include:
+
+ - Filtering travel forum posts
+ - Moderating tourism or education chatbots
+ - Enhancing region-specific AI assistants (e.g., for NZ government or tourism services)
+ - Educational or cultural awareness tools focused on New Zealand
+
+ Input: A string of text
+ Output: One of two labels, `"related"` or `"not_related"`
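
+ As a quick sketch of this usage, the classifier can also be called through the `transformers` `pipeline` API; this assumes `peft` is installed so the adapter repository resolves automatically, and the printed score is illustrative:

+ ```python
+ from transformers import pipeline
+
+ # Load the adapter repo as a text-classification pipeline (requires peft).
+ classifier = pipeline(
+     "text-classification",
+     model="geoffmunn/Qwen3Guard-NewZealand-Classification-4B",
+ )
+
+ print(classifier("Where can I see glowworms near Waitomo?"))
+ # Example output shape: [{'label': 'related', 'score': 0.99}]
+ ```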

+ ### Downstream Use

+ This model can be integrated into larger systems such as:

+ - Themed conversational agents (e.g., a _New Zealand_-focused travel advisor)
+ - Content routing engines that classify user queries by geographic relevance
+ - Fine-tuning starter for other country/region-specific classifiers using similar methodology
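
+ As a rough illustration of the content-routing case above (the handler names are hypothetical and not part of the repository), a gate around the classifier pipeline shown earlier might look like:

+ ```python
+ # Hypothetical routing gate: send New Zealand-related queries to a specialist
+ # assistant and everything else to a general-purpose one.
+ def route_query(text: str, classifier) -> str:
+     result = classifier(text)[0]  # e.g. {"label": "related", "score": 0.97}
+     return "nz_travel_assistant" if result["label"] == "related" else "general_assistant"
+
+ # route_query("Is Milford Sound worth visiting in July?", classifier)
+ ```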

  ### Out-of-Scope Use

+ This model should not be used for:

+ - General content moderation (toxicity, hate speech, etc.)
+ - Medical, legal, or safety-critical decision-making
+ - Multilingual classification (it was trained only on English)
+ - Detecting nuanced sentiment or emotion
+ - Classifying topics outside geography, culture, or national identity without retraining
+
+ It may produce inaccurate classifications when presented with ambiguous place names (e.g., "Auckland" referring to a place outside New Zealand), metaphorical language, or topics only tangentially related to New Zealand.

  ## Bias, Risks, and Limitations

+ The training data consists entirely of synthetically generated questions about _New Zealand_, which introduces several limitations:

+ - Potential overfitting to question formats rather than natural-language statements
+ - Limited coverage of Māori language or te reo phrases (trained on English only)
+ - Uneven representation of regions (e.g., more focus on major cities like Auckland or Wellington)
+ - Biases toward well-known landmarks, history, or pop culture (e.g., _Lord of the Rings_) over lesser-known local topics

+ Additionally, because the dataset was auto-generated using prompts, there may be inconsistencies in labeling or artificial phrasing patterns.

+ ### Recommendations

+ Users should validate performance on real-world data before deployment.
+ For production use, consider augmenting the dataset with human-labeled examples and testing across diverse inputs (including Māori terms, regional slang, and edge cases).
+ Always pair this model with broader safeguards if it is used in public-facing applications.

  ## How to Get Started with the Model

+ You can load and run inference using Hugging Face Transformers:

+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ model_id = "geoffmunn/Qwen3Guard-NewZealand-Classification-4B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSequenceClassification.from_pretrained(model_id)
+
+ input_text = "What is the capital city of New Zealand?"
+ inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
+
+ outputs = model(**inputs)
+ predicted_class_id = outputs.logits.argmax().item()
+ label = model.config.id2label[predicted_class_id]
+
+ print(f"Label: {label}")
+ ```

+ Ensure you have the required libraries installed:

+ ```bash
+ pip install transformers torch peft
+ ```
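
+ If the adapter does not load automatically through `AutoModelForSequenceClassification`, a two-step load with PEFT is a reasonable fallback; this is a sketch rather than the only supported path, and the label mappings mirror the Preprocessing section below:

+ ```python
+ from peft import PeftModel
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ base_id = "Qwen/Qwen3-4B"
+ adapter_id = "geoffmunn/Qwen3Guard-NewZealand-Classification-4B"
+
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ base_model = AutoModelForSequenceClassification.from_pretrained(
+     base_id,
+     num_labels=2,
+     id2label={0: "not_related", 1: "related"},
+     label2id={"not_related": 0, "related": 1},
+ )
+
+ # Attach the LoRA adapter weights on top of the base classifier.
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+ model.eval()
+ ```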

+ ## Training Details

+ ### Training Data

+ The model was trained on a synthetic JSONL dataset containing 2,500 labeled examples of New Zealand-related questions marked as `"related"`, and an equal number of randomly sampled general knowledge questions labeled `"not_related"`.
+ The dataset was generated using a custom script `generate_new_zealand_questions.py` from the repository.

+ Dataset format:

+ ```json
+ {"input": "Where is Fiordland National Park located?", "label": "related"}
+ {"input": "Who painted the Mona Lisa?", "label": "not_related"}
+ ```
+
+ Place your dataset at: `finetuning/new_zealand/new_zealand_guard_dataset.jsonl`
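
+ A minimal sketch for loading this JSONL file with the Hugging Face `datasets` library (the 90/10 split mirrors the holdout described under Evaluation; the seed is an arbitrary choice here):

+ ```python
+ from datasets import load_dataset
+
+ # Each line of the JSONL file becomes one example with "input" and "label" fields.
+ dataset = load_dataset(
+     "json",
+     data_files="finetuning/new_zealand/new_zealand_guard_dataset.jsonl",
+     split="train",
+ )
+ dataset = dataset.train_test_split(test_size=0.1, seed=42)
+ print(dataset["train"][0])
+ ```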

+ ### Training Procedure

+ #### Preprocessing

+ Text inputs were tokenized using the Qwen3 tokenizer with a maximum sequence length of 512 tokens.
+ Inputs longer than this were truncated. Labels were mapped via:

+ ```python
+ label2id = {"not_related": 0, "related": 1}
+ id2label = {0: "not_related", 1: "related"}
+ ```
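
+ An illustrative preprocessing step implied by this description (the exact function in the training script may differ) is shown below:

+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
+ label2id = {"not_related": 0, "related": 1}
+
+ def preprocess(example):
+     # Tokenize the input text (truncating to 512 tokens) and attach the integer label.
+     encoded = tokenizer(example["input"], truncation=True, max_length=512)
+     encoded["labels"] = label2id[example["label"]]
+     return encoded
+
+ # tokenized = dataset.map(preprocess, remove_columns=["input", "label"])
+ ```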

+ #### Training Hyperparameters

+ - **Training regime:** Mixed-precision training (fp16), enabled via Hugging Face Accelerate
+ - **Batch size:** 2 (per GPU)
+ - **Gradient accumulation steps:** 16 → effective batch size: 32
+ - **Number of epochs:** 3
+ - **Learning rate:** 2e-4
+ - **Optimizer:** AdamW
+ - **Max sequence length:** 512
+ - **LoRA configuration:**
+   - **Rank (r):** 16
+   - **Alpha:** 32
+   - **Dropout:** 0.05
+   - **Target modules:** attention query/value layers and MLP up/down projections
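
+ A sketch of the LoRA and trainer configuration implied by these values follows; the target-module names are inferred from the description above, so check the fine-tuning script in the repository for the exact list:

+ ```python
+ from peft import LoraConfig, TaskType
+ from transformers import TrainingArguments
+
+ lora_config = LoraConfig(
+     task_type=TaskType.SEQ_CLS,
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "v_proj", "up_proj", "down_proj"],  # assumed names
+ )
+
+ training_args = TrainingArguments(
+     output_dir="qwen3guard-nz-classifier",  # hypothetical output path
+     per_device_train_batch_size=2,
+     gradient_accumulation_steps=16,         # effective batch size 32
+     num_train_epochs=3,
+     learning_rate=2e-4,
+     fp16=True,
+     optim="adamw_torch",
+ )
+ ```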

+ #### Speeds, Sizes, Times

+ - **Hardware used:** NVIDIA GPU (assumed: A100 or equivalent)
+ - **Training time:** ~2–3 hours depending on hardware
+ - **Checkpoint size:** ~3.8 GB (adapter weights only, PEFT format)
+ - **Inference memory:** < 10 GB VRAM (further reduction possible with quantization)

+ ## Evaluation

  ### Testing Data, Factors & Metrics

  #### Testing Data

+ A 10% holdout test set (~500 samples) was used for evaluation, split from the full dataset during training.

  #### Factors

+ Evaluation focused on accuracy across:

+ - Well-known vs. obscure NZ locations or facts
+ - Question vs. statement format
+ - Use of local terms (e.g., "Kiwi", "All Blacks", "Te Reo")

  #### Metrics

+ - Accuracy: primary metric
+ - Precision, Recall, F1-score: per-class metrics reported during training
+ - Confusion matrix: generated internally during the test phase
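
+ These per-class metrics can be reproduced along the following lines with scikit-learn; the label ids follow the mapping from the Preprocessing section, and the predictions shown are placeholders rather than real results:

+ ```python
+ from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
+
+ # Placeholder gold labels and predictions (0 = not_related, 1 = related);
+ # in practice these come from running the model over the held-out test split.
+ y_true = [1, 0, 1, 1, 0, 0]
+ y_pred = [1, 0, 1, 0, 0, 0]
+
+ print(accuracy_score(y_true, y_pred))
+ print(classification_report(y_true, y_pred, target_names=["not_related", "related"]))
+ print(confusion_matrix(y_true, y_pred))
+ ```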

  ### Results

+ During final evaluation, the model achieved:

+ - Accuracy: ~96–98% (on the synthetic test set)
+ - Strong precision/recall for the "related" class
+ - Minor false positives on topics involving other Southern Hemisphere countries (e.g., Australia) or general travel queries

+ #### Summary

+ The model performs well on its intended task within the scope of the training distribution but may degrade on edge cases, ambiguous geography, or culturally nuanced references.

+ ## Technical Specifications

  ### Model Architecture and Objective

+ - **Base architecture:** Qwen3-4B (causal decoder-only LLM)
+ - **Adaptation method:** LoRA (PEFT)
+ - **Task head:** Sequence classification (single-label)
+ - **Objective function:** Cross-entropy loss
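
+ How these pieces fit together at training time can be sketched as follows, reusing the `lora_config` from the Training Hyperparameters section; the printed figures are illustrative of how small the trainable LoRA footprint is relative to the 4B base model:

+ ```python
+ from peft import get_peft_model
+ from transformers import AutoModelForSequenceClassification
+
+ # Base model with a single-label classification head, wrapped with the LoRA adapter.
+ base = AutoModelForSequenceClassification.from_pretrained("Qwen/Qwen3-4B", num_labels=2)
+ peft_model = get_peft_model(base, lora_config)
+ peft_model.print_trainable_parameters()  # e.g. "trainable params: ... || all params: ..."
+ ```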

  ### Compute Infrastructure

  #### Hardware

+ GPU: NVIDIA A100 / RTX 3090 / L40S or equivalent
+ RAM: ≥ 32 GB system memory recommended

  #### Software

+ - Python 3.10+
+ - PyTorch 2.4+ with CUDA 12.1+
+ - Transformers 4.40+
+ - PEFT 0.18.0
+ - Accelerate, Datasets, Tokenizers

+ ## Citation

+ While no formal paper exists, please cite the GitHub repository if used academically.

  **BibTeX:**

+ ```bibtex
+ @software{munn_qwen3guard_2025,
+   author    = {Munn, Geoff},
+   title     = {Qwen3Guard: Demonstration of Qwen3Guard Models for Content Classification},
+   year      = {2025},
+   publisher = {GitHub},
+   journal   = {GitHub repository},
+   url       = {https://github.com/geoffmunn/Qwen3Guard}
+ }
+ ```

  **APA:**

+ Munn, G. (2025). Qwen3Guard: Demonstration of Qwen3Guard Models for Content Classification [Software]. GitHub. https://github.com/geoffmunn/Qwen3Guard

+ ## Glossary

+ - **LoRA (Low-Rank Adaptation):** A parameter-efficient fine-tuning technique that adds trainable low-rank matrices to pretrained weights.
+ - **PEFT:** Parameter-Efficient Fine-Tuning, a Hugging Face library for lightweight adaptation of large models.
+ - **GGUF:** Format used for running models in llama.cpp; not supported for the streaming variant here.
+ - **JSONL:** JSON Lines format – one JSON object per line.

+ ## More Information

+ For more details, including API server setup and web demos, visit:
+ 👉 https://github.com/geoffmunn/Qwen3Guard

+ Includes:

+ - Ollama-compatible scripts
+ - Flask-based API server (`api_server.py`)
+ - HTML chat interface (`new_zealand_chat.html`)
+ - Dataset generation tools

+ ## Model Card Authors

+ Geoff Munn – Developer and maintainer

  ## Model Card Contact

+ For questions or feedback, contact the author via GitHub:
+ @geoffmunn

  ### Framework versions

  - PEFT 0.18.0