Improve model card for Label Anything: Add metadata, links, abstract, and usage

#1
by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +91 -4
README.md CHANGED
@@ -2,9 +2,96 @@
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
+ license: mit
+ pipeline_tag: image-segmentation
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: [More Information Needed]
- - Docs: [More Information Needed]
- - ArXiv: https://arxiv.org/abs/2407.02075
+ # 🏷️ Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts
+
+ **Label Anything** is a novel transformer-based architecture for multi-prompt, multi-way few-shot semantic segmentation that significantly reduces annotation burden while maintaining high accuracy.
+
+ [![Paper](https://img.shields.io/badge/Paper-2407.02075-b31b1b.svg)](https://huggingface.co/papers/2407.02075)
+ [![Project Page](https://img.shields.io/badge/🌐_Project-Page-blue.svg)](https://pasqualedem.github.io/LabelAnything/)
+ [![arXiv](https://img.shields.io/badge/arXiv-2407.02075-b31b1b.svg)](https://arxiv.org/abs/2407.02075)
+ [![GitHub](https://img.shields.io/badge/GitHub-Code-black.svg?logo=github&style=flat-square)](https://github.com/pasqualedem/LabelAnything)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/pasqualedem/LabelAnything/blob/main/LICENSE)
+
+ ## Abstract
+ Few-shot semantic segmentation aims to segment objects from previously unseen classes using only a limited number of labeled examples. In this paper, we introduce Label Anything, a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation. Our approach leverages diverse visual prompts -- points, bounding boxes, and masks -- to create a highly flexible and generalizable framework that significantly reduces annotation burden while maintaining high accuracy. Label Anything makes three key contributions: ($\textit{i}$) we introduce a new task formulation that relaxes conventional few-shot segmentation constraints by supporting various types of prompts, multi-class classification, and enabling multiple prompts within a single image; ($\textit{ii}$) we propose a novel architecture based on transformers and attention mechanisms; and ($\textit{iii}$) we design a versatile training procedure allowing our model to operate seamlessly across different $N$-way $K$-shot and prompt-type configurations with a single trained model. Our extensive experimental evaluation on the widely used COCO-$20^i$ benchmark demonstrates that Label Anything achieves state-of-the-art performance among existing multi-way few-shot segmentation methods, while significantly outperforming leading single-class models when evaluated in multi-class settings.
+
+ ## Overview
+ **Label Anything** is a novel method for multi-class few-shot semantic segmentation using visual prompts. This model card accompanies the official implementation of our ECAI 2025 paper, enabling precise segmentation with just a few prompted examples.
+
+ <div align="center">
+ <img src="https://github.com/pasqualedem/LabelAnything/raw/main/assets/la.png" alt="Label Anything Demo" width="70%">
+ <em>Visual prompting meets few-shot learning with a new fast and efficient architecture.</em>
+ </div>
+
+ This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.
+
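+ To illustrate what the mixin integration provides, here is a minimal, self-contained sketch; `TinySegmenter` is a hypothetical stand-in, not the actual Label Anything class, and only the mixin calls (`save_pretrained`, `from_pretrained`, `push_to_hub`) reflect the real `huggingface_hub` API.
+
+ ```python
+ import torch.nn as nn
+ from huggingface_hub import PyTorchModelHubMixin
+
+ # Hypothetical example: any nn.Module that also inherits the mixin
+ # gains save_pretrained / from_pretrained / push_to_hub for free,
+ # with its __init__ kwargs serialized to config.json.
+ class TinySegmenter(nn.Module, PyTorchModelHubMixin):
+     def __init__(self, hidden_dim: int = 256):
+         super().__init__()
+         self.proj = nn.Conv2d(3, hidden_dim, kernel_size=1)
+
+     def forward(self, x):
+         return self.proj(x)
+
+ model = TinySegmenter(hidden_dim=256)
+ model.save_pretrained("tiny-segmenter")                    # writes config + weights locally
+ reloaded = TinySegmenter.from_pretrained("tiny-segmenter")
+ ```
+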
+ ## ✨ Key Features
+ - **🎯 Few-Shot Learning**: Achieve remarkable results with minimal training data.
+ - **🖼️ Visual Prompting**: Intuitive interaction through visual cues (points, bounding boxes, masks).
+ - **⚡ Multi-GPU Support**: Accelerated training on modern hardware.
+ - **🔄 Cross-Validation**: Robust 4-fold evaluation protocol.
+ - **📊 Rich Logging**: Comprehensive experiment tracking with Weights & Biases.
+ - **🤗 HuggingFace Integration**: Seamless model sharing and deployment.
+
+ ## 🚀 How to Use
+
+ ### ⚡ One-Line Demo
+ Experience Label Anything instantly with our streamlined demo:
+
+ ```bash
+ uvx --from git+https://github.com/pasqualedem/LabelAnything app
+ ```
+
+ > **💡 Pro Tip**: This command uses [uv](https://docs.astral.sh/uv/) for lightning-fast package management and execution.
+
+ ### 🔌 Model Loading (Python)
+ You can load a pre-trained model as follows:
+
+ ```python
+ from label_anything.models import LabelAnything
+
+ # Load pre-trained model, e.g., "pasqualedem/label_anything_sam_1024_coco"
+ model = LabelAnything.from_pretrained("pasqualedem/label_anything_sam_1024_coco")
+ ```
+
+ For detailed usage, including manual installation and the training pipeline, please refer to the [official GitHub repository](https://github.com/pasqualedem/LabelAnything).
+
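+ Since the loaded checkpoint is a standard `torch.nn.Module`, generic PyTorch calls work on it directly. The snippet below is a sketch using only such generic operations; the model-specific input format for query images and visual prompts is documented in the GitHub repository.
+
+ ```python
+ import torch
+
+ model.eval()                                   # switch to inference mode
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ model = model.to(device)
+
+ # Sanity check: count the parameters in the loaded checkpoint
+ n_params = sum(p.numel() for p in model.parameters())
+ print(f"Loaded Label Anything with {n_params / 1e6:.1f}M parameters on {device}")
+ ```
+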
+ ## 📦 Pre-trained Models
+ Access our collection of state-of-the-art checkpoints; each one loads with the same `from_pretrained` call, as shown after the table:
+
+ <div align="center">
+
+ | 🧠 Encoder | 📏 Embedding Size | 🖼️ Image Size | 📁 Fold | 🔗 Checkpoint |
+ |------------|-------------------|----------------|----------|---------------|
+ | **SAM** | 512 | 1024 | - | [![HF](https://img.shields.io/badge/🤗_HuggingFace-Model-FFD21E?style=for-the-badge)](https://huggingface.co/pasqualedem/label_anything_sam_1024_coco) |
+ | **ViT-MAE** | 256 | 480 | - | [![HF](https://img.shields.io/badge/🤗_HuggingFace-Model-FFD21E?style=for-the-badge)](https://huggingface.co/pasqualedem/label_anything_mae_480_coco) |
+ | **ViT-MAE** | 256 | 480 | 0 | [![HF](https://img.shields.io/badge/🤗_HuggingFace-Model-FFD21E?style=for-the-badge)](https://huggingface.co/pasqualedem/label_anything_coco_fold0_mae_7a5p0t63) |
+
+ </div>
+
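+ Any entry in the table can be loaded by swapping in its repository ID, for example the ViT-MAE checkpoint:
+
+ ```python
+ from label_anything.models import LabelAnything
+
+ # Repository ID taken from the checkpoint table above
+ model = LabelAnything.from_pretrained("pasqualedem/label_anything_mae_480_coco")
+ ```
+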
+ ## 📄 Citation
+ If you find Label Anything useful in your research, please cite our work:
+
+ ```bibtex
+ @inproceedings{labelanything2025,
+   title={LabelAnything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts},
+   author={De Marinis, Pasquale and Fanelli, Nicola and Scaringi, Raffaele and Colonna, Emanuele and Fiameni, Giuseppe and Vessio, Gennaro and Castellano, Giovanna},
+   booktitle={ECAI 2025},
+   year={2025}
+ }
+ ```
+
+ ## 📜 License
+ This project is licensed under the MIT License - see the [LICENSE](https://github.com/pasqualedem/LabelAnything/blob/main/LICENSE) file for details.
+
+ ---
+
+ <div align="center">
+
+ **Made with ❤️ by the CilabUniba Label Anything Team**
+
+ </div>