# BERT Fine-tuned on AG News for 4-Class Text Classification

## Model Description
This model is a fine-tuned version of `bert-base-uncased` on the AG News dataset for text classification.
It classifies English news articles into one of four categories:
| Label | English Category | Korean Label |
|---|---|---|
| 0 | World | 세계뉴스 |
| 1 | Sports | 스포츠 |
| 2 | Business | 비즈니스 |
| 3 | Science/Technology | 과학/기술 |
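For post-processing predictions, the same mapping can be written as a Python dictionary. A minimal sketch; whether the released model config already embeds these strings in its `id2label` field is an assumption:

```python
# Label mapping from the table above; assumes the released config
# does not already populate id2label with these Korean strings.
id2label = {0: "세계뉴스", 1: "스포츠", 2: "비즈니스", 3: "과학/기술"}
label2id = {label: idx for idx, label in id2label.items()}
```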
## Intended Uses & Limitations

### Intended Uses
- News article categorization
- Content tagging for media organizations
- NLP education and experimentation
### Limitations
- Trained only on English text, so it may not generalize to other languages
- May reflect biases present in the AG News dataset
- Requires further evaluation before use in production or fairness-sensitive applications
## Training Details

- Base model: `bert-base-uncased`
- Dataset: AG News
- Task: Text classification (4 classes)
- Tokenizer: `BertTokenizer` (uncased, max length 128)
- Optimizer: AdamW
- Loss function: CrossEntropyLoss
- Batch size: 16
- Epochs: 3
- Learning rate: 2e-5
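The training script itself is not included in this repository, but the hyperparameters above are enough to outline a reproduction. A minimal sketch, assuming the `datasets` library and the public `ag_news` dataset on the Hub:

```python
# Hypothetical reproduction of the fine-tuning setup above;
# the actual training script was not released with this model.
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4
)

dataset = load_dataset("ag_news", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)
dataset.set_format("torch", columns=["input_ids", "attention_mask", "label"])
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = AdamW(model.parameters(), lr=2e-5)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).train()

for epoch in range(3):
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        # BertForSequenceClassification applies CrossEntropyLoss
        # internally when `labels` is passed.
        outputs = model(input_ids=batch["input_ids"],
                        attention_mask=batch["attention_mask"],
                        labels=batch["label"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```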
## Evaluation
| Metric | Score |
|---|---|
| Accuracy | 82.5% |
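The evaluation code is likewise not included; a minimal sketch of how this figure could be re-checked on the AG News test split, using the same placeholder repository name as the usage example below:

```python
# Hypothetical evaluation sketch; assumes the AG News test split
# from the `datasets` library and the placeholder repository name.
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "your-username/bert-agnews-korean-labels"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval()

test = load_dataset("ag_news", split="test")
test = test.map(lambda b: tokenizer(b["text"], truncation=True,
                                    padding="max_length", max_length=128),
                batched=True)
test.set_format("torch", columns=["input_ids", "attention_mask", "label"])

correct = total = 0
with torch.no_grad():
    for batch in DataLoader(test, batch_size=16):
        logits = model(input_ids=batch["input_ids"],
                       attention_mask=batch["attention_mask"]).logits
        preds = logits.argmax(dim=-1)
        correct += (preds == batch["label"]).sum().item()
        total += batch["label"].size(0)
print(f"Accuracy: {correct / total:.3f}")
```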
## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "your-username/bert-agnews-korean-labels"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize a single news snippet and run a forward pass.
text = "Apple announces the new iPhone at their annual event."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = torch.argmax(outputs.logits, dim=-1).item()

# Map the predicted class index to its Korean label.
label_map = {0: "세계뉴스", 1: "스포츠", 2: "비즈니스", 3: "과학/기술"}
print("Predicted label:", label_map[predicted_class])
```
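For quick experiments, the same checkpoint can also be wrapped in a `pipeline`. A minimal sketch; the printed label string depends on whatever `id2label` mapping the released config carries, which is an assumption here:

```python
from transformers import pipeline

# Convenience wrapper; label strings come from the model config's id2label.
classifier = pipeline("text-classification",
                      model="your-username/bert-agnews-korean-labels")
print(classifier("Stocks rallied after the strong earnings report."))
```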
## Ethical Considerations
- May reflect label bias inherent in the AG News dataset.
- Should not be used for misinformation or surveillance purposes.
- Verify performance before deploying in high-impact use cases.
## Citation

If you use this model, please cite the base paper:

```bibtex
@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
```