---
base_model: Qwen/Qwen3-4B
library_name: peft
tags:
- base_model:adapter:Qwen/Qwen3-4B
- lora
- transformers
- text-classification
- moderation
- new-zealand
---

# Model Card for geoffmunn/Qwen3Guard-NewZealand-Classification-4B

This is Qwen3-4B fine-tuned with LoRA (Low-Rank Adaptation) to classify whether user-provided text is related to New Zealand.
The model acts as a domain-specific content classifier, returning one of two labels: `"related"` or `"not_related"`.
It was developed as part of the Qwen3Guard demonstration project to showcase how large language models can be adapted for custom classification tasks.

## Model Details

### Model Description

This model is a binary sequence classifier fine-tuned on a synthetic dataset of New Zealand-related questions and general non-New Zealand text.
Built atop the Qwen3-4B foundation model, it uses parameter-efficient fine-tuning via LoRA to adapt the model for topic detection in conversational or input text.
It is designed for use in moderation systems where filtering based on geographic, cultural, or national topics like New Zealand is desired.

- **Developed by:** Geoff Munn (@geoffmunn)
- **Shared by:** Geoff Munn
- **Model type:** Causal language model with LoRA adapter for sequence classification
- **Language(s) (NLP):** English
- **License:** MIT License (see GitHub repo)
- **Finetuned from model:** Qwen/Qwen3-4B

### Model Sources

- **Repository:** https://github.com/geoffmunn/Qwen3Guard
- **Demo:** Interactive demo available via `new_zealand_chat.html` in the repository; requires a local API server

## Uses

### Direct Use

The model can directly classify whether a given piece of text is related to _New Zealand_. Example applications include:

- Filtering travel forum posts
- Moderating tourism or education chatbots
- Enhancing region-specific AI assistants (e.g., for NZ government or tourism services)
- Educational or cultural awareness tools focused on New Zealand

- **Input:** A string of text
- **Output:** One of two labels, `"related"` or `"not_related"`

### Downstream Use

This model can be integrated into larger systems such as:

- Themed conversational agents (e.g., a _New Zealand_-focused travel advisor)
- Content routing engines that classify user queries by geographic relevance
- Fine-tuning starter for other country/region-specific classifiers using similar methodology

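
As an illustration of the content-routing case listed above, a minimal sketch might look like the following. It assumes a `classify` callable that wraps the model and returns `"related"` or `"not_related"`; the handler names `nz_assistant` and `general_assistant` and the keyword stub are hypothetical and are not part of the repository.

```python
from typing import Callable


def route_query(text: str, classify: Callable[[str], str]) -> str:
    """Return the name of the (hypothetical) handler that should answer `text`."""
    label = classify(text)
    if label == "related":
        return "nz_assistant"
    if label == "not_related":
        return "general_assistant"
    # Guard against unexpected labels instead of silently mis-routing
    raise ValueError(f"Unexpected label: {label!r}")


# Stand-in classifier for demonstration only; a real system would call the model.
def keyword_stub(text: str) -> str:
    return "related" if "new zealand" in text.lower() else "not_related"


print(route_query("Best hikes in New Zealand?", keyword_stub))
```

Passing the classifier in as a callable keeps the routing logic testable without loading the 4B model.
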
### Out-of-Scope Use

This model should not be used for:

- General content moderation (toxicity, hate speech, etc.)
- Medical, legal, or safety-critical decision-making
- Multilingual classification (trained only on English)
- Detecting nuanced sentiment or emotion
- Classifying topics outside geography, culture, or national identity without retraining

It may produce inaccurate classifications when presented with ambiguous place names (e.g., "Auckland" in California), metaphorical language, or topics only tangentially related to New Zealand.

## Bias, Risks, and Limitations

The training data consists entirely of synthetically generated questions about _New Zealand_, which introduces several limitations:

- Potential overfitting to question formats rather than natural language statements
- Limited coverage of Māori language or te reo phrases (trained on English only)
- Uneven representation of regions (e.g., more focus on major cities like Auckland or Wellington)
- Biases toward well-known landmarks, history, or pop culture (e.g., _Lord of the Rings_) over lesser-known local topics

Additionally, because the dataset was auto-generated using prompts, there may be inconsistencies in labeling or artificial phrasing patterns.

### Recommendations

Users should validate performance on real-world data before deployment.
For production use, consider augmenting the dataset with human-labeled examples and testing across diverse inputs (including Māori terms, regional slang, and edge cases).
Always pair this model with broader safeguards if used in public-facing applications.

## How to Get Started with the Model

You can load and run inference using Hugging Face Transformers:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "geoffmunn/Qwen3Guard-NewZealand-Classification-4B"

# Loading a LoRA adapter repository this way relies on the Transformers
# PEFT integration, so the `peft` package must be installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

input_text = "What is the capital city of New Zealand?"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)

# Inference only: skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class for the single input
predicted_class_id = outputs.logits.argmax(dim=-1).item()
label = model.config.id2label[predicted_class_id]

print(f"Label: {label}")
```

Ensure you have the required libraries installed:

```bash
pip install transformers torch peft
```

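
The loading example above reports only the top label. Where a confidence score is also useful, the two logits can be turned into probabilities with a softmax. The sketch below is a pure-Python illustration: the helper name `label_with_confidence` is hypothetical, and it takes the logits as a plain list (for example from `outputs.logits[0].tolist()`) together with the `id2label` mapping documented in this card.

```python
import math

ID2LABEL = {0: "not_related", 1: "related"}


def label_with_confidence(logits, id2label=ID2LABEL):
    """Convert raw class logits to (label, probability) via a softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for illustration; these are not model outputs
label, confidence = label_with_confidence([-1.2, 2.3])
print(label, round(confidence, 3))
```

A downstream system could then treat low-confidence predictions as "uncertain" and send them for manual review; any such threshold would need tuning on real data.
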
## Training Details

### Training Data

The model was trained on a synthetic JSONL dataset containing 2,500 labeled examples of New Zealand-related questions marked as `"related"`, and an equal number of randomly sampled general knowledge questions labeled `"not_related"`.
The dataset was generated using a custom script `generate_new_zealand_questions.py` from the repository.

Dataset format:

```json
{"input": "Where is Fiordland National Park located?", "label": "related"}
{"input": "Who painted the Mona Lisa?", "label": "not_related"}
```

Place your dataset at: `finetuning/new_zealand/new_zealand_guard_dataset.jsonl`

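
To show how records in this format might be read and checked before training, here is a minimal sketch. The function name `load_examples` and the validation rules are assumptions made for illustration; the repository's own scripts may handle this differently.

```python
import json

VALID_LABELS = {"related", "not_related"}


def load_examples(lines):
    """Parse JSONL lines into dicts, rejecting malformed records."""
    examples = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record.get("input"), str) or not record["input"]:
            raise ValueError(f"line {line_number}: missing or empty 'input'")
        if record.get("label") not in VALID_LABELS:
            raise ValueError(f"line {line_number}: bad label {record.get('label')!r}")
        examples.append(record)
    return examples


sample = [
    '{"input": "Where is Fiordland National Park located?", "label": "related"}',
    '{"input": "Who painted the Mona Lisa?", "label": "not_related"}',
]
examples = load_examples(sample)
print(len(examples), examples[0]["label"])
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly, e.g. `load_examples(open(path, encoding="utf-8"))`.
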
### Training Procedure

#### Preprocessing

Text inputs were tokenized using the Qwen3 tokenizer with a maximum sequence length of 512 tokens.
Inputs longer than this were truncated. Labels were mapped via:

```python
label2id = {"not_related": 0, "related": 1}
id2label = {0: "not_related", 1: "related"}
```

#### Training Hyperparameters

- **Training regime:** Mixed precision training (fp16), enabled via Hugging Face Accelerate
- **Batch size:** 2 (per GPU)
- **Gradient accumulation steps:** 16 (effective batch size: 32)
- **Number of epochs:** 3
- **Learning rate:** 2e-4
- **Optimizer:** AdamW
- **Max sequence length:** 512
- **LoRA configuration:**
  - **Rank (r):** 16
  - **Alpha:** 32
  - **Dropout:** 0.05
  - **Target modules:** attention query/value layers and MLP up/down projections

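
The effective batch size follows from the per-GPU batch size and the accumulation steps. The sketch below works through that arithmetic and estimates the number of optimizer steps. It assumes a single GPU and roughly 4,500 training examples (5,000 minus the 10% holdout described under Evaluation); the card does not state either figure directly, so treat the step counts as estimates.

```python
import math

per_device_batch_size = 2
gradient_accumulation_steps = 16
num_gpus = 1            # assumption: single-GPU training
num_epochs = 3
train_examples = 4500   # assumption: 5,000 examples minus a 10% holdout

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Rounding up counts a final partial batch as one more step;
# some training loops round down instead.
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(effective_batch_size, steps_per_epoch, total_steps)
```
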
#### Speeds, Sizes, Times

- **Hardware used:** NVIDIA GPU (assumed: A100 or equivalent)
- **Training time:** ~2–3 hours depending on hardware
- **Checkpoint size:** ~3.8 GB (adapter weights only, PEFT format)
- **Inference memory:** < 10 GB VRAM (further reduction possible with quantization)

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A 10% holdout test set (~500 samples) was used for evaluation, split from the full dataset during training.

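
A holdout split of this kind could be produced as sketched below. This is an illustrative pure-Python version, not the project's training script: the function name `stratified_holdout`, the fixed seed, and the choice to stratify by label are all assumptions.

```python
import random
from collections import defaultdict


def stratified_holdout(examples, test_fraction=0.1, seed=42):
    """Split examples into (train, test), keeping label proportions equal."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example["label"]].append(example)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):  # sorted so the split is reproducible
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Synthetic stand-in data with the same 2,500 / 2,500 class balance
data = [{"input": f"q{i}", "label": "related" if i % 2 else "not_related"}
        for i in range(5000)]
train, test = stratified_holdout(data)
print(len(train), len(test))
```
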
#### Factors

Evaluation focused on accuracy across:

- Well-known vs. obscure NZ locations or facts
- Question vs. statement format
- Use of local terms (e.g., "Kiwi", "All Blacks", "Te Reo")

#### Metrics

- Accuracy: Primary metric
- Precision, Recall, F1-score: Per-class metrics reported during training
- Confusion Matrix: Generated internally during test phase

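
For reference, the metrics listed above can be computed from true and predicted labels as in the following sketch. The function name `binary_metrics` is hypothetical and the labels are toy values, not model results; the actual training code may rely on a library such as scikit-learn instead.

```python
def binary_metrics(y_true, y_pred, positive="related"):
    """Accuracy, precision, recall, F1 and confusion counts for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    # Fall back to 0.0 when a denominator is empty
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
    }


# Toy labels for illustration only; these are not results from the model
y_true = ["related", "related", "not_related", "not_related"]
y_pred = ["related", "not_related", "not_related", "related"]
print(binary_metrics(y_true, y_pred))
```

Calling the function with `positive="not_related"` gives the per-class figures for the other label.
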
### Results

During final evaluation, the model achieved:

- Accuracy: ~96–98% (on synthetic test set)
- Strong precision/recall for "related" class
- Minor false positives on topics involving other Southern Hemisphere countries (e.g., Australia) or general travel queries

#### Summary

The model performs well on its intended task within the scope of the training distribution but may degrade on edge cases, ambiguous geography, or culturally nuanced references.

## Technical Specifications

### Model Architecture and Objective

- **Base architecture:** Qwen3-4B (causal decoder-only LLM)
- **Adaptation method:** LoRA (PEFT)
- **Task head:** Sequence classification (single-label)
- **Objective function:** Cross-entropy loss

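
For a single example, the cross-entropy objective is the negative log of the softmax probability assigned to the correct class. A small pure-Python sketch follows; the function name `cross_entropy` is hypothetical and the logits are made-up values, so this illustrates the formula rather than the training code.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-softmax of the target class, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


# Equal logits mean a 50/50 prediction, so the loss equals ln(2)
print(cross_entropy([0.0, 0.0], target_index=1))
# A confident, correct prediction gives a loss close to zero
print(cross_entropy([-4.0, 4.0], target_index=1))
```
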
### Compute Infrastructure

#### Hardware

- GPU: NVIDIA A100 / RTX 3090 / L40S or equivalent
- RAM: ≥ 32 GB system memory recommended

#### Software

- Python 3.10+
- PyTorch 2.4+ with CUDA 12.1+
- Transformers 4.40+
- PEFT 0.18.0
- Accelerate, Datasets, Tokenizers

## Citation

While no formal paper exists, please cite the GitHub repository if used academically.

**BibTeX:**

```bibtex
@software{munn_qwen3guard_2025,
  author = {Munn, Geoff},
  title = {Qwen3Guard: Demonstration of Qwen3Guard Models for Content Classification},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/geoffmunn/Qwen3Guard}
}
```

**APA:**

Munn, G. (2025). Qwen3Guard: Demonstration of Qwen3Guard Models for Content Classification [Software]. GitHub. https://github.com/geoffmunn/Qwen3Guard

## Glossary

- **LoRA (Low-Rank Adaptation):** A parameter-efficient fine-tuning technique that adds trainable low-rank matrices to pretrained weights.
- **PEFT:** Parameter-Efficient Fine-Tuning, a Hugging Face library for lightweight adaptation of large models.
- **GGUF:** Format used for running models in llama.cpp; not supported for streaming variant here.
- **JSONL:** JSON Lines format, with one JSON object per line.

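
In equation form, the LoRA entry above corresponds to the standard low-rank update below, where $W_0$ is a frozen pretrained weight matrix and $A$, $B$ are the trainable factors. Assuming the default (non rank-stabilized) scaling, the rank and alpha listed under Training Hyperparameters give a scaling factor of $\alpha / r = 32 / 16 = 2$.

$$
W = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d_{\text{out}} \times r}, \quad A \in \mathbb{R}^{r \times d_{\text{in}}}
$$

Only $A$ and $B$ are updated during fine-tuning, which adds $r\,(d_{\text{in}} + d_{\text{out}})$ trainable parameters per adapted matrix.
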
## More Information

For more details, including API server setup and web demos, visit:
https://github.com/geoffmunn/Qwen3Guard

The repository includes:

- Ollama-compatible scripts
- Flask-based API server (`api_server.py`)
- HTML chat interface (`new_zealand_chat.html`)
- Dataset generation tools

## Model Card Authors

Geoff Munn, developer and maintainer

## Model Card Contact

For questions or feedback, contact the author via GitHub:
@geoffmunn

### Framework versions

- PEFT 0.18.0