---
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
license: mit
pipeline_tag: image-segmentation
---

# 🏷️ Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

Label Anything introduces a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation, significantly reducing annotation burden while maintaining high accuracy.

**Links:** Paper · Project Page · arXiv · [GitHub](https://github.com/pasqualedem/LabelAnything) · License: MIT

## Abstract

Few-shot semantic segmentation aims to segment objects from previously unseen classes using only a limited number of labeled examples. In this paper, we introduce Label Anything, a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation. Our approach leverages diverse visual prompts -- points, bounding boxes, and masks -- to create a highly flexible and generalizable framework that significantly reduces annotation burden while maintaining high accuracy. Label Anything makes three key contributions: ($\textit{i}$) we introduce a new task formulation that relaxes conventional few-shot segmentation constraints by supporting various types of prompts, multi-class classification, and enabling multiple prompts within a single image; ($\textit{ii}$) we propose a novel architecture based on transformers and attention mechanisms; and ($\textit{iii}$) we design a versatile training procedure allowing our model to operate seamlessly across different $N$-way $K$-shot and prompt-type configurations with a single trained model. Our extensive experimental evaluation on the widely used COCO-$20^i$ benchmark demonstrates that Label Anything achieves state-of-the-art performance among existing multi-way few-shot segmentation methods, while significantly outperforming leading single-class models when evaluated in multi-class settings.
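
To make the $N$-way $K$-shot formulation concrete, the sketch below models an episode with mixed prompt types as plain Python data structures. This is purely illustrative: all class and field names here are hypothetical and do not come from the official codebase.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical illustration of the task formulation (names are NOT from
# the official codebase): an N-way K-shot episode pairs a query image
# with K support examples per class, each annotated by points, boxes,
# or masks -- possibly several prompts within a single support image.
@dataclass
class VisualPrompt:
    kind: str     # "point" | "box" | "mask"
    data: object  # (x, y), (x1, y1, x2, y2), or a binary mask array

@dataclass
class SupportExample:
    image_path: str
    class_id: int                # one of the N episode classes
    prompts: List[VisualPrompt]  # multiple prompts per image are allowed

@dataclass
class Episode:
    query_image_path: str
    supports: List[SupportExample]  # N classes x K shots

# A 2-way 1-shot episode mixing prompt types:
episode = Episode(
    query_image_path="query.jpg",
    supports=[
        SupportExample("cat.jpg", class_id=0,
                       prompts=[VisualPrompt("point", (120, 80))]),
        SupportExample("dog.jpg", class_id=1,
                       prompts=[VisualPrompt("box", (10, 20, 200, 180))]),
    ],
)
```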

## Overview

Label Anything is a novel method for multi-class few-shot semantic segmentation using visual prompts. This repository contains the official implementation of our ECAI 2025 paper, enabling precise segmentation with just a few prompted examples.

*Label Anything demo: visual prompting meets few-shot learning with a new, fast, and efficient architecture.*

This model has been pushed to the Hub using the PyTorchModelHubMixin integration.
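
For context, the snippet below shows the general `PyTorchModelHubMixin` pattern on a toy module (not the actual Label Anything architecture): inheriting from the mixin is what gives a model its `from_pretrained`, `save_pretrained`, and `push_to_hub` methods.

```python
import torch
from torch import nn
from huggingface_hub import PyTorchModelHubMixin

# Toy example of the mixin pattern -- NOT the Label Anything architecture.
# Inheriting from PyTorchModelHubMixin adds from_pretrained /
# save_pretrained / push_to_hub to an ordinary nn.Module.
class ToySegmenter(nn.Module, PyTorchModelHubMixin):
    def __init__(self, channels: int = 16):
        super().__init__()
        self.conv = nn.Conv2d(3, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x)

model = ToySegmenter(channels=16)
model.save_pretrained("toy-segmenter")                    # writes config + weights
reloaded = ToySegmenter.from_pretrained("toy-segmenter")  # restores them
```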

## ✨ Key Features

- 🎯 **Few-Shot Learning**: Achieve remarkable results with minimal training data.
- 🖼️ **Visual Prompting**: Intuitive interaction through visual cues (points, bounding boxes, masks).
- ⚡ **Multi-GPU Support**: Accelerated training on modern hardware.
- 🔄 **Cross-Validation**: Robust 4-fold evaluation protocol.
- 📊 **Rich Logging**: Comprehensive experiment tracking with Weights & Biases.
- 🤗 **HuggingFace Integration**: Seamless model sharing and deployment.

## 🚀 How to Use

### ⚡ One-Line Demo

Experience Label Anything instantly with our streamlined demo:

```bash
uvx --from git+https://github.com/pasqualedem/LabelAnything app
```

> 💡 **Pro Tip**: This command uses [uv](https://github.com/astral-sh/uv) for lightning-fast package management and execution.

### 🔌 Model Loading (Python)

You can load a pre-trained model as follows:

```python
from label_anything.models import LabelAnything

# Load a pre-trained model, e.g. "pasqualedem/label_anything_sam_1024_coco"
model = LabelAnything.from_pretrained("pasqualedem/label_anything_sam_1024_coco")
```

For detailed usage, including manual installation and the training pipeline, please refer to the official [GitHub repository](https://github.com/pasqualedem/LabelAnything).
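
As a slightly fuller sketch, the snippet below loads the checkpoint, moves it to the available device, and switches to inference mode; since the model uses `PyTorchModelHubMixin`, the standard Hub methods are available as well. The exact forward-pass input format (query image plus prompted support set) is defined in the repository and is not reproduced here.

```python
import torch
from label_anything.models import LabelAnything

# Load the checkpoint and prepare it for inference. The forward-pass
# input format (query image + prompted support set) is defined in the
# GitHub repository, so it is not shown here.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LabelAnything.from_pretrained("pasqualedem/label_anything_sam_1024_coco")
model = model.to(device).eval()

# The PyTorchModelHubMixin methods work here too, e.g. to share a
# fine-tuned copy (the repo name below is a placeholder):
# model.push_to_hub("your-username/your-label-anything-variant")
```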

## 📦 Pre-trained Models

Access our collection of state-of-the-art checkpoints:

| 🧠 Encoder | 📏 Embedding Size | 🖼️ Image Size | 📁 Fold | 🔗 Checkpoint |
|---|---|---|---|---|
| SAM | 512 | 1024 | - | HF |
| ViT-MAE | 256 | 480 | - | HF |
| ViT-MAE | 256 | 480 | 0 | HF |

## 📄 Citation

If you find Label Anything useful in your research, please cite our work:

```bibtex
@inproceedings{labelanything2025,
  title={LabelAnything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts},
  author={De Marinis, Pasquale and Fanelli, Nicola and Scaringi, Raffaele and Colonna, Emanuele and Fiameni, Giuseppe and Vessio, Gennaro and Castellano, Giovanna},
  booktitle={ECAI 2025},
  year={2025}
}
```

## 📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by the CilabUniba Label Anything Team