RuBERT base fine-tuned on ruDEFT and WCL Wiki Ru datasets for NER

The model extracts terms and definitions from text. Labels:

  • Term - the word or phrase being defined.
  • Definition - the span that defines the term.

Usage example:
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("psytechlab/wcl-wiki_rudeft__ner-model")
model = AutoModelForTokenClassification.from_pretrained("psytechlab/wcl-wiki_rudeft__ner-model")
model.eval()

# "Oromo is an African ethnic group living in Ethiopia and, to a lesser extent, in Kenya."
inputs = tokenizer('оромо — это африканская этническая группа, проживающая в эфиопии и в меньшей степени в кении.', return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predictions = torch.argmax(logits, dim=-1)[0].tolist() 

tokens = inputs["input_ids"][0]
word_ids = inputs.word_ids(batch_index=0)

# Group sub-token predictions by their source word.
word_to_labels = {}
for token_id, word_id, label_id in zip(tokens, word_ids, predictions):
    if word_id is None:  # skip special tokens ([CLS], [SEP])
        continue
    if word_id not in word_to_labels:
        word_to_labels[word_id] = []
    word_to_labels[word_id].append(label_id)

# Label each word by the prediction of its first sub-token.
word_level_predictions = [model.config.id2label[labels[0]] for labels in word_to_labels.values()]

print(word_level_predictions)
# ['B-Term', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
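
To turn these word-level tags into term/definition spans, the BIO labels can be grouped. The helper below is a minimal sketch and not part of the model card; the `words` list is illustrative and assumes the same whitespace/punctuation splitting that the fast tokenizer uses for its word ids.

def bio_to_spans(words, labels):
    """Group word-level BIO labels into (entity_type, text) spans."""
    spans, current_words, current_type = [], [], None
    for word, label in zip(words, labels):
        if label.startswith("B-"):
            if current_words:
                spans.append((current_type, " ".join(current_words)))
            current_type, current_words = label[2:], [word]
        elif label.startswith("I-") and current_type == label[2:]:
            current_words.append(word)
        else:
            if current_words:
                spans.append((current_type, " ".join(current_words)))
            current_words, current_type = [], None
    if current_words:
        spans.append((current_type, " ".join(current_words)))
    return spans

# Illustrative word list matching the 17 word-level predictions above.
words = ("оромо — это африканская этническая группа , проживающая в эфиопии "
         "и в меньшей степени в кении .").split()
print(bio_to_spans(words, word_level_predictions))
# With the predictions above: [('Term', 'оромо')]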

Training procedure

Training

The training was done with the Hugging Face Trainer class using the following arguments:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ner-model",  # placeholder: the card does not specify an output directory
    eval_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    num_train_epochs=7,
    weight_decay=0.01,
)
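
For context, here is a minimal sketch of how such arguments could be wired into a Trainer run. The base checkpoint name, the `train_dataset`/`eval_dataset` objects, and the label count are assumptions for illustration and are not specified in the card:

from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
)

# Assumed: "DeepPavlov/rubert-base-cased" as the RuBERT base checkpoint and
# five labels (O, B-Term, I-Term, B-Definition, I-Definition) as in the reports below.
base = "DeepPavlov/rubert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForTokenClassification.from_pretrained(base, num_labels=5)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # tokenized dataset with aligned NER labels (assumed)
    eval_dataset=eval_dataset,    # held-out split (assumed)
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()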

Metrics

Metrics on the combined set (ruDEFT + WCL Wiki Ru), psytechlab/rus_rudeft_wcl-wiki:

              precision    recall  f1-score   support

I-Definition       0.75      0.90      0.82      3344
B-Definition       0.62      0.73      0.67       230
      I-Term       0.80      0.85      0.82       524
           O       0.97      0.91      0.94     11359
      B-Term       0.96      0.93      0.94      2977

    accuracy                           0.91     18434
   macro avg       0.82      0.87      0.84     18434
weighted avg       0.92      0.91      0.91     18434

Metrics only on astromis/ruDEFT:

              precision    recall  f1-score   support

I-Definition       0.90      0.90      0.90      3344
B-Definition       0.74      0.73      0.74       230
      I-Term       0.83      0.87      0.85       389
           O       0.86      0.86      0.86      2222
      B-Term       0.87      0.85      0.86       638

    accuracy                           0.87      6823
   macro avg       0.84      0.84      0.84      6823
weighted avg       0.87      0.87      0.87      6823

Metrics only on astromis/WCL_Wiki_Ru:

              precision    recall  f1-score   support

I-Definition       0.00      0.00      0.00         0
B-Definition       0.00      0.00      0.00         0
      I-Term       0.72      0.78      0.75       135
           O       1.00      0.93      0.96      9137
      B-Term       0.99      0.95      0.97      2339

    accuracy                           0.93     11611
   macro avg       0.54      0.53      0.54     11611
weighted avg       0.99      0.93      0.96     11611
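
The reports above appear to follow scikit-learn's classification_report format. A rough sketch of how such a token-level report could be reproduced, assuming `true_labels` and `pred_labels` are flat, aligned lists of per-token tag strings with special tokens already filtered out:

from sklearn.metrics import classification_report

# `true_labels` and `pred_labels` are assumed lists of tag strings such as
# "B-Term" or "O" for every real (non-special) token in the evaluation set.
print(classification_report(true_labels, pred_labels, digits=2))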

Citation

@article{Popov2025TransferringNL,
  title={Transferring Natural Language Datasets Between Languages Using Large Language Models for Modern Decision Support and Sci-Tech Analytical Systems},
  author={Dmitrii Popov and Egor Terentev and Danil Serenko and Ilya Sochenkov and Igor Buyanov},
  journal={Big Data and Cognitive Computing},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:278179500}
}

