## Model Description

This model is a fine-tuned version of GreekBERT ([nlpaueb/bert-base-greek-uncased-v1](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)), trained jointly for multiclass news-topic classification and named entity recognition (NER) on Greek news articles.
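A single forward pass returns both topic logits and per-token NER logits (see the example below). As an illustration only, not the released implementation, a shared-encoder dual-head layout of this kind can look like the following sketch, with the class counts taken from the label sets documented further down (19 topics, 32 BIO tags):

```python
import torch.nn as nn
from transformers import AutoModel

class DualHeadBert(nn.Module):
    """Illustrative shared-encoder, two-head layout (an assumption, not the actual code)."""

    def __init__(self, encoder_name: str, num_topics: int = 19, num_tags: int = 32):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.topic_head = nn.Linear(hidden, num_topics)  # sentence-level: reads the [CLS] vector
        self.ner_head = nn.Linear(hidden, num_tags)      # token-level: one tag per position

    def forward(self, **inputs):
        hidden_states = self.encoder(**inputs).last_hidden_state
        return self.topic_head(hidden_states[:, 0]), self.ner_head(hidden_states)
```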
## Dataset

The model was fine-tuned on the GreekNews-20k dataset.

## Results

Classification performance on the GreekNews-20k dataset:

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Αυτοκίνητο | 0.94 | 0.95 | 0.94 | 201 |
| Επιχειρήσεις και βιομηχανία | 0.73 | 0.78 | 0.75 | 369 |
| Έγκλημα και δικαιοσύνη | 0.93 | 0.89 | 0.91 | 314 |
| Ειδήσεις για καταστροφές και έκτακτες ανάγκες | 0.83 | 0.79 | 0.81 | 272 |
| Οικονομικά και χρηματοοικονομικά | 0.78 | 0.74 | 0.76 | 495 |
| Εκπαίδευση | 0.85 | 0.92 | 0.88 | 259 |
| Ψυχαγωγία και πολιτισμός | 0.81 | 0.85 | 0.83 | 251 |
| Περιβάλλον και κλίμα | 0.81 | 0.75 | 0.78 | 292 |
| Οικογένεια και σχέσεις | 0.87 | 0.89 | 0.88 | 294 |
| Μόδα | 0.96 | 0.93 | 0.94 | 259 |
| Τρόφιμα και ποτά | 0.69 | 0.90 | 0.78 | 262 |
| Υγεία και ιατρική | 0.76 | 0.71 | 0.73 | 346 |
| Μεταφορές και υποδομές | 0.78 | 0.86 | 0.82 | 321 |
| Ψυχική υγεία και ευεξία | 0.84 | 0.79 | 0.81 | 348 |
| Πολιτική και κυβέρνηση | 0.89 | 0.69 | 0.78 | 339 |
| Θρησκεία | 0.89 | 0.95 | 0.92 | 271 |
| Αθλητισμός | 1.00 | 0.98 | 0.99 | 212 |
| Ταξίδια και αναψυχή | 0.88 | 0.88 | 0.88 | 424 |
| Τεχνολογία και επιστήμη | 0.77 | 0.78 | 0.78 | 308 |
| accuracy | | | 0.83 | 5837 |
| macro avg | 0.84 | 0.84 | 0.84 | 5837 |
| weighted avg | 0.83 | 0.83 | 0.83 | 5837 |

NER performance on the GreekNews-20k dataset:

| Entity | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| CARDINAL | 0.87 | 0.97 | 0.91 | 25656 |
| DATE | 0.89 | 0.92 | 0.91 | 15469 |
| EVENT | 0.71 | 0.73 | 0.72 | 1720 |
| FAC | 0.53 | 0.60 | 0.56 | 2118 |
| GPE | 0.88 | 0.95 | 0.91 | 16010 |
| LOC | 0.82 | 0.70 | 0.75 | 3547 |
| MONEY | 0.78 | 0.83 | 0.80 | 3882 |
| NORP | 0.91 | 0.92 | 0.91 | 1926 |
| ORDINAL | 0.92 | 0.98 | 0.95 | 3891 |
| ORG | 0.78 | 0.85 | 0.82 | 22184 |
| PERCENT | 0.73 | 0.86 | 0.79 | 7286 |
| PERSON | 0.89 | 0.93 | 0.91 | 16524 |
| PRODUCT | 0.70 | 0.56 | 0.63 | 2071 |
| QUANTITY | 0.74 | 0.76 | 0.75 | 2588 |
| TIME | 0.74 | 0.90 | 0.81 | 2390 |
| micro avg | 0.83 | 0.90 | 0.86 | 127262 |
| macro avg | 0.79 | 0.83 | 0.81 | 127262 |
| weighted avg | 0.84 | 0.90 | 0.86 | 127262 |

NER performance on the elNER dataset:

| Entity | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| CARDINAL | 0.91 | 0.97 | 0.94 | 911 |
| DATE | 0.92 | 0.92 | 0.92 | 838 |
| EVENT | 0.57 | 0.57 | 0.57 | 130 |
| FAC | 0.49 | 0.44 | 0.47 | 77 |
| GPE | 0.84 | 0.95 | 0.89 | 826 |
| LOC | 0.80 | 0.64 | 0.71 | 178 |
| MONEY | 0.98 | 0.98 | 0.98 | 111 |
| NORP | 0.89 | 0.92 | 0.91 | 141 |
| ORDINAL | 0.95 | 0.93 | 0.94 | 172 |
| ORG | 0.81 | 0.79 | 0.80 | 1388 |
| PERCENT | 0.96 | 1.00 | 0.98 | 206 |
| PERSON | 0.93 | 0.95 | 0.94 | 1051 |
| PRODUCT | 0.61 | 0.37 | 0.46 | 83 |
| QUANTITY | 0.76 | 0.78 | 0.77 | 65 |
| TIME | 0.90 | 0.92 | 0.91 | 137 |
| micro avg | 0.87 | 0.88 | 0.87 | 6314 |
| macro avg | 0.82 | 0.81 | 0.81 | 6314 |
| weighted avg | 0.87 | 0.88 | 0.87 | 6314 |
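
For reference, entity-level tables in this format are what the `seqeval` library produces, scoring complete BIO spans rather than individual tokens (the topic-classification table above follows the analogous layout of scikit-learn's `classification_report`). A minimal sketch of the metric, as an assumption about tooling rather than the authors' exact evaluation code:

```python
from seqeval.metrics import classification_report

# One list of BIO tags per sentence, gold vs. predicted.
y_true = [["B-PERSON", "I-PERSON", "O", "B-GPE", "O"]]
y_pred = [["B-PERSON", "I-PERSON", "O", "O", "O"]]

# Prints per-entity precision/recall/F1 plus micro/macro/weighted averages.
print(classification_report(y_true, y_pred))
```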
## To use this model

```bash
pip install transformers torch
```

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("katrjohn/GreekNewsBERT", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
```
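
If a GPU is available, the model can be moved there before inference; this is standard PyTorch usage, not anything specific to this checkpoint, and any tokenized inputs must be moved to the same device:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()  # disable dropout for deterministic inference

# When running the example below on GPU, also move the batch:
# inputs = {k: v.to(device) for k, v in inputs.items()}
```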
## Example usage

```python
import torch

# Classification label dictionary (index -> topic label)
classification_label_dict_reverse = {
    0: "Αυτοκίνητο", 1: "Επιχειρήσεις και βιομηχανία", 2: "Έγκλημα και δικαιοσύνη",
    3: "Ειδήσεις για καταστροφές και έκτακτες ανάγκες", 4: "Οικονομικά και χρηματοοικονομικά", 5: "Εκπαίδευση",
    6: "Ψυχαγωγία και πολιτισμός", 7: "Περιβάλλον και κλίμα", 8: "Οικογένεια και σχέσεις",
    9: "Μόδα", 10: "Τρόφιμα και ποτά", 11: "Υγεία και ιατρική", 12: "Μεταφορές και υποδομές",
    13: "Ψυχική υγεία και ευεξία", 14: "Πολιτική και κυβέρνηση", 15: "Θρησκεία",
    16: "Αθλητισμός", 17: "Ταξίδια και αναψυχή", 18: "Τεχνολογία και επιστήμη"
}

# NER tag set (BIO scheme, plus the PAD tag used during training)
ner_label_set = ["PAD", "O",
    "B-ORG", "I-ORG", "B-PERSON", "I-PERSON", "B-CARDINAL", "I-CARDINAL",
    "B-GPE", "I-GPE", "B-DATE", "I-DATE", "B-ORDINAL", "I-ORDINAL",
    "B-PERCENT", "I-PERCENT", "B-LOC", "I-LOC", "B-NORP", "I-NORP",
    "B-MONEY", "I-MONEY", "B-TIME", "I-TIME", "B-EVENT", "I-EVENT",
    "B-PRODUCT", "I-PRODUCT", "B-FAC", "I-FAC", "B-QUANTITY", "I-QUANTITY"
]
tag2idx = {t: i for i, t in enumerate(ner_label_set)}
idx2tag = {i: t for t, i in tag2idx.items()}

sentence = "Ο Κυριάκος Μητσοτάκης επισκέφθηκε τη Θεσσαλονίκη για τα εγκαίνια της ΔΕΘ."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    classification_logits, ner_logits = model(**inputs)

# Classification
classification_probs = torch.softmax(classification_logits, dim=-1)
predicted_class = torch.argmax(classification_probs, dim=-1).item()
predicted_class_label = classification_label_dict_reverse.get(predicted_class, "Unknown")
print(f"Predicted class index: {predicted_class}")
print(f"Predicted class label: {predicted_class_label}")

# NER
ner_predictions = torch.argmax(ner_logits, dim=-1).squeeze().tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze())
for token, pred_idx in zip(tokens, ner_predictions):
    tag = idx2tag.get(pred_idx, "O")
    if token in ["[CLS]", "[SEP]"]:
        tag = "O"
    print(f"{token}: {tag}")
```
Output:

```text
Predicted class index: 14
Predicted class label: Πολιτική και κυβέρνηση
[CLS]: O
ο: O
κυριακος: B-PERSON
μητσοτακης: I-PERSON
επισκεφθηκε: O
τη: O
θεσσαλονικη: B-GPE
για: O
τα: O
εγκαινια: O
της: O
δεθ: B-EVENT
.: O
[SEP]: O
```
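
The example prints one tag per WordPiece token. To turn those tags into complete entity spans you can merge consecutive B-/I- tokens; a minimal sketch, reusing `tokens` and `ner_predictions` from the example above and assuming the standard `##` sub-word prefix of BERT tokenizers:

```python
def extract_entities(tokens, tags):
    """Merge BIO tags over WordPiece tokens into (text, label) spans."""
    entities, current_tokens, current_label = [], [], None
    for token, tag in zip(tokens, tags):
        if token in ("[CLS]", "[SEP]", "[PAD]"):
            continue
        if tag.startswith("B-"):
            if current_tokens:  # flush the previous span
                entities.append((" ".join(current_tokens).replace(" ##", ""), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)  # continue the open span
        else:
            if current_tokens:
                entities.append((" ".join(current_tokens).replace(" ##", ""), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens).replace(" ##", ""), current_label))
    return entities

tags = [idx2tag.get(i, "O") for i in ner_predictions]
print(extract_entities(tokens, tags))
# e.g. [('κυριακος μητσοτακης', 'PERSON'), ('θεσσαλονικη', 'GPE'), ('δεθ', 'EVENT')]
```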
## Author

This model was released alongside the article *Named Entity Recognition and News Article Classification: A Lightweight Approach*.

If you use this model, please cite the following:
```bibtex
@ARTICLE{11148234,
  author={Katranis, Ioannis and Troussas, Christos and Krouska, Akrivi and Mylonas, Phivos and Sgouropoulou, Cleo},
  journal={IEEE Access},
  title={Named Entity Recognition and News Article Classification: A Lightweight Approach},
  year={2025},
  volume={13},
  number={},
  pages={155031-155046},
  keywords={Accuracy;Transformers;Pipelines;Named entity recognition;Computational modeling;Vocabulary;Tagging;Real-time systems;Benchmark testing;Training;Distilled transformer;edge-deployable model;multiclass news-topic classification;named entity recognition},
  doi={10.1109/ACCESS.2025.3605709}
}
```