---
library_name: transformers
license: cc-by-nc-sa-4.0
base_model: microsoft/layoutlmv3-base
tags:
- layoutlmv3
- document-ai
- invoices
- token-classification
- pytorch
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: layoutlmv3-finetuned-invoices
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition (Invoices)
    dataset:
      name: Custom Invoice Dataset
      type: invoices
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.9037
    - name: Recall
      type: recall
      value: 0.8871
    - name: F1
      type: f1
      value: 0.8954
---

# layoutlmv3-finetuned-invoices

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on a **custom invoice dataset** for document understanding tasks. It was trained with the Hugging Face `Trainer` API using early stopping and mixed precision.

## Results

The model achieves the following results on the held-out validation set after 10 epochs:

| Epoch | Train Loss | Val Loss | Precision  | Recall | F1         |
|-------|------------|----------|------------|--------|------------|
| 1     | 0.8329     | 0.7094   | 0.7184     | 0.6524 | 0.6838     |
| 5     | 0.3815     | 0.3104   | 0.8625     | 0.8559 | 0.8592     |
| 8     | 0.2988     | 0.2350   | 0.8999     | 0.8803 | 0.8900     |
| 10    | 0.2499     | 0.2254   | **0.9037** | 0.8872 | **0.8954** |

---

## Model description

- **Architecture**: LayoutLMv3
- **Base model**: microsoft/layoutlmv3-base
- **Task**: Token classification for invoice understanding (e.g., extracting key fields).
- **Input**: Scanned invoices (images + text tokens + bounding boxes).
- **Output**: Predicted entity labels (e.g., Invoice Number, Date, Vendor, Total).

---

## Intended uses & limitations

- **Use cases**:
  - Automatic information extraction from invoices, receipts, and financial documents.
  - Document AI pipelines for expense management and automation.
- **Limitations**:
  - Fine-tuned only on a limited invoice dataset.
  - May not generalize well to other document types (contracts, ID cards, etc.).
  - Sensitive to OCR quality; cleaner input text yields better results.

---

## Training details

### Hyperparameters

- Learning rate: `3e-5`
- Train batch size: `4`
- Eval batch size: `4`
- Epochs: `10` (with early stopping)
- Optimizer: AdamW
- Weight decay: `0.01`
- Mixed precision (fp16): ✅
- Dataloader workers: `2`

A `Trainer` sketch that mirrors these settings is included at the end of this card.

### Framework versions

- Transformers 4.56.0
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.0

---

## How to use

```python
from PIL import Image
from transformers import AutoModelForTokenClassification, AutoProcessor

repo_id = "your-username/layoutlmv3-finetuned-invoices"

model = AutoModelForTokenClassification.from_pretrained(repo_id)
# By default the processor runs Tesseract OCR on the image (requires pytesseract).
processor = AutoProcessor.from_pretrained(repo_id)

# Example inference
image = Image.open("sample_invoice.png").convert("RGB")
encoding = processor(image, return_tensors="pt")
outputs = model(**encoding)
```
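
The snippet above stops at the raw model outputs. As a minimal sketch of post-processing (assuming the repository ships the usual `id2label` mapping and a fast tokenizer; the label names shown in the comment are illustrative), the logits can be turned into word-level labels by continuing from the variables defined above:

```python
# Pick the highest-scoring class for every token and map it to its label string.
predictions = outputs.logits.argmax(-1).squeeze().tolist()
token_labels = [model.config.id2label[p] for p in predictions]

# Collapse sub-word tokens back to words: keep the label of each word's first sub-token.
word_ids = encoding.word_ids(batch_index=0)
word_labels = {}
for idx, word_id in enumerate(word_ids):
    if word_id is not None and word_id not in word_labels:
        word_labels[word_id] = token_labels[idx]

print(word_labels)  # e.g. {0: "B-INVOICE_NUM", 1: "O", ...} -- label names depend on the model config
```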
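
Because the model is sensitive to OCR quality, you may prefer to supply text and bounding boxes from your own OCR engine instead of the processor's built-in Tesseract step. A sketch, assuming boxes are already normalized to the 0–1000 range expected by LayoutLMv3 (the words and boxes below are purely illustrative):

```python
from PIL import Image
from transformers import AutoModelForTokenClassification, AutoProcessor

repo_id = "your-username/layoutlmv3-finetuned-invoices"

model = AutoModelForTokenClassification.from_pretrained(repo_id)
# apply_ocr=False disables the built-in Tesseract step so we can pass our own words/boxes.
processor = AutoProcessor.from_pretrained(repo_id, apply_ocr=False)

image = Image.open("sample_invoice.png").convert("RGB")

# Illustrative OCR output; each box is [x0, y0, x1, y1] normalized to 0-1000.
words = ["Invoice", "No.", "12345", "Total:", "$1,999.00"]
boxes = [
    [60, 40, 160, 70],
    [165, 40, 205, 70],
    [210, 40, 300, 70],
    [60, 900, 140, 930],
    [150, 900, 260, 930],
]

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
outputs = model(**encoding)
```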
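
The training script itself is not part of this repository. The sketch below only mirrors the hyperparameters listed under **Training details**; the label list, the `train_dataset`/`eval_dataset` objects, the `compute_metrics` function, and the early-stopping patience are placeholders you would replace with your own.

```python
from transformers import (
    AutoModelForTokenClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Placeholder label set -- the published model uses the labels stored in its config.
label_list = ["O", "B-INVOICE_NUM", "B-DATE", "B-VENDOR", "B-TOTAL"]

model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/layoutlmv3-base",
    num_labels=len(label_list),
)

# AdamW is the Trainer's default optimizer.
training_args = TrainingArguments(
    output_dir="layoutlmv3-finetuned-invoices",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=10,
    weight_decay=0.01,
    fp16=True,
    dataloader_num_workers=2,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,    # required for early stopping
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # your encoded train split
    eval_dataset=eval_dataset,        # your encoded validation split
    compute_metrics=compute_metrics,  # e.g. seqeval precision/recall/F1
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is illustrative
)
trainer.train()
```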