BERT (base-cased) for CoNLL-2003 NER: Full Fine-Tune

This repository contains a BERT base-cased model fine-tuned for named entity recognition on the CoNLL-2003 dataset (parquet version), evaluated with seqeval (entity-level F1).

📊 Result (this run)

  • Entity-level macro F1: 0.9192 (seqeval; see the sketch below)
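
The score comes from seqeval, which counts a prediction as correct only when an entity's type and boundaries both match exactly. A minimal sketch of the computation on hypothetical tag sequences (not data from this run):

from seqeval.metrics import f1_score

# Hypothetical gold and predicted tag sequences for a single sentence.
y_true = [["B-PER", "I-PER", "O", "B-ORG"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC"]]

print(f1_score(y_true, y_pred))                   # micro: 0.5 (1 of 2 spans exact)
print(f1_score(y_true, y_pred, average="macro"))  # macro over entity types: ~0.33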

Usage

from transformers import pipeline

clf = pipeline("token-classification",
               model="starkdv123/conll2003-bert-ner-full",
               aggregation_strategy="simple")  # merge subwords into entity spans
print(clf("Chris Hoiles hit his 22nd homer for Baltimore."))

Training summary

  • Base: bert-base-cased
  • Epochs: 3, LR: 3e-5, batch size 16 (train) / 32 (eval), max_len 256, weight_decay 0.01, fp16 (see the TrainingArguments sketch after this list)
  • Label alignment: -100 for subword continuations (see the alignment sketch after this list)
  • Metric: seqeval F1 (entity-level)
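
For reference, the hyperparameters above expressed as Hugging Face TrainingArguments. This is a sketch, not the exact training script: output_dir is a placeholder, and reading "batch 16/32" as train/eval batch sizes is an assumption.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="conll2003-bert-ner-full",  # placeholder path
    num_train_epochs=3,
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,  # assumed reading of "batch 16/32"
    weight_decay=0.01,
    fp16=True,
)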

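A minimal sketch of the -100 label alignment, assuming the standard tokenize-and-align recipe for BERT token classification (the function name is illustrative). Since -100 is the default ignore_index of the cross-entropy loss, continuation subwords contribute nothing to training:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_and_align_labels(words, word_labels):
    # words: pre-split tokens of one sentence; word_labels: one tag id per word.
    enc = tokenizer(words, is_split_into_words=True,
                    truncation=True, max_length=256)
    labels, prev_word_id = [], None
    for word_id in enc.word_ids():
        if word_id is None:            # special tokens ([CLS], [SEP])
            labels.append(-100)
        elif word_id != prev_word_id:  # first subword keeps the word's label
            labels.append(word_labels[word_id])
        else:                          # subword continuation -> ignored
            labels.append(-100)
        prev_word_id = word_id
    enc["labels"] = labels
    return enc
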
Confusion Matrix

         LOC    MISC       O     ORG     PER
   LOC    411       6      21      32       3
  MISC      9    2213      51      76      14
     O     67     110   38063      58      17
   ORG     31      77      32    2353      10
   PER      3      42      15      24    2689
