---
library_name: transformers
license: cc-by-nc-sa-4.0
base_model: microsoft/layoutlmv3-base
tags:
- layoutlmv3
- document-ai
- invoices
- token-classification
- pytorch
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: layoutlmv3-finetuned-invoices
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition (Invoices)
    dataset:
      name: Custom Invoice Dataset
      type: invoices
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.9037
    - name: Recall
      type: recall
      value: 0.8871
    - name: F1
      type: f1
      value: 0.8954
---

# layoutlmv3-finetuned-invoices

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on a **custom invoice dataset** for document understanding tasks. It was trained with the Hugging Face `Trainer` API using early stopping and mixed precision.

## Results

The model achieves the following results on the held-out validation set after 10 epochs:

| Epoch | Train Loss | Val Loss | Precision  | Recall | F1         |
|-------|------------|----------|------------|--------|------------|
| 1     | 0.8329     | 0.7094   | 0.7184     | 0.6524 | 0.6838     |
| 5     | 0.3815     | 0.3104   | 0.8625     | 0.8559 | 0.8592     |
| 8     | 0.2988     | 0.2350   | 0.8999     | 0.8803 | 0.8900     |
| 10    | 0.2499     | 0.2254   | **0.9037** | 0.8872 | **0.8954** |

---

## Model description

- **Architecture**: LayoutLMv3
- **Base model**: microsoft/layoutlmv3-base
- **Task**: Token classification for invoice understanding (e.g., extracting key fields).
- **Input**: Scanned invoices (images + text tokens + bounding boxes).
- **Output**: Predicted entity labels (e.g., Invoice Number, Date, Vendor, Total).

---

## Intended uses & limitations

- **Use cases**:
  - Automatic information extraction from invoices, receipts, and financial documents.
  - Document AI pipelines for expense management and automation.
- **Limitations**:
  - Fine-tuned only on a limited invoice dataset.
  - May not generalize well to other document types (contracts, ID cards, etc.).
  - Sensitive to OCR quality; cleaner input text yields better results.

---

## Training details

### Hyperparameters

- Learning rate: `3e-5`
- Train batch size: `4`
- Eval batch size: `4`
- Epochs: `10` (with early stopping)
- Optimizer: AdamW
- Weight decay: `0.01`
- Mixed precision (fp16): ✅
- Dataloader workers: `2`

A `Trainer` sketch that mirrors these settings is included at the end of this card.

### Framework versions

- Transformers 4.56.0
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.0

---

## How to use

```python
from PIL import Image
from transformers import AutoModelForTokenClassification, AutoProcessor

repo_id = "your-username/layoutlmv3-finetuned-invoices"

model = AutoModelForTokenClassification.from_pretrained(repo_id)
# By default the processor runs Tesseract OCR on the image (requires pytesseract).
processor = AutoProcessor.from_pretrained(repo_id)

# Example inference
image = Image.open("sample_invoice.png").convert("RGB")
encoding = processor(image, return_tensors="pt")
outputs = model(**encoding)
```
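
The snippet above stops at the raw model outputs. As a minimal sketch of post-processing (assuming the repository ships the usual `id2label` mapping and a fast tokenizer; the label names shown in the comment are illustrative), the logits can be turned into word-level labels by continuing from the variables defined above:

```python
# Pick the highest-scoring class for every token and map it to its label string.
predictions = outputs.logits.argmax(-1).squeeze().tolist()
token_labels = [model.config.id2label[p] for p in predictions]

# Collapse sub-word tokens back to words: keep the label of each word's first sub-token.
word_ids = encoding.word_ids(batch_index=0)
word_labels = {}
for idx, word_id in enumerate(word_ids):
    if word_id is not None and word_id not in word_labels:
        word_labels[word_id] = token_labels[idx]

print(word_labels)  # e.g. {0: "B-INVOICE_NUM", 1: "O", ...} -- label names depend on the model config
```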
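
Because the model is sensitive to OCR quality, you may prefer to supply text and bounding boxes from your own OCR engine instead of the processor's built-in Tesseract step. A sketch, assuming boxes are already normalized to the 0–1000 range expected by LayoutLMv3 (the words and boxes below are purely illustrative):

```python
from PIL import Image
from transformers import AutoModelForTokenClassification, AutoProcessor

repo_id = "your-username/layoutlmv3-finetuned-invoices"

model = AutoModelForTokenClassification.from_pretrained(repo_id)
# apply_ocr=False disables the built-in Tesseract step so we can pass our own words/boxes.
processor = AutoProcessor.from_pretrained(repo_id, apply_ocr=False)

image = Image.open("sample_invoice.png").convert("RGB")

# Illustrative OCR output; each box is [x0, y0, x1, y1] normalized to 0-1000.
words = ["Invoice", "No.", "12345", "Total:", "$1,999.00"]
boxes = [
    [60, 40, 160, 70],
    [165, 40, 205, 70],
    [210, 40, 300, 70],
    [60, 900, 140, 930],
    [150, 900, 260, 930],
]

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
outputs = model(**encoding)
```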
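
The training script itself is not part of this repository. The sketch below only mirrors the hyperparameters listed under **Training details**; the label list, the `train_dataset`/`eval_dataset` objects, the `compute_metrics` function, and the early-stopping patience are placeholders you would replace with your own.

```python
from transformers import (
    AutoModelForTokenClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Placeholder label set -- the published model uses the labels stored in its config.
label_list = ["O", "B-INVOICE_NUM", "B-DATE", "B-VENDOR", "B-TOTAL"]

model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/layoutlmv3-base",
    num_labels=len(label_list),
)

# AdamW is the Trainer's default optimizer.
training_args = TrainingArguments(
    output_dir="layoutlmv3-finetuned-invoices",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=10,
    weight_decay=0.01,
    fp16=True,
    dataloader_num_workers=2,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,    # required for early stopping
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # your encoded train split
    eval_dataset=eval_dataset,        # your encoded validation split
    compute_metrics=compute_metrics,  # e.g. seqeval precision/recall/F1
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is illustrative
)
trainer.train()
```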