FILMUnet2D
This model is a 2D U-Net with FiLM conditioning for multi-organ ultrasound segmentation.
Installation
Make sure you have transformers and torch installed.
pip install transformers torch
Usage
You can load the model and processor using the Auto classes from transformers. Since this repository contains custom code, make sure to pass trust_remote_code=True.
import torch
from transformers import AutoModel, AutoImageProcessor
from PIL import Image
# 1. Load model and processor
repo_id = "AImageLab-Zip/US_FiLMUNet"
processor = AutoImageProcessor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()
# 2. Load and preprocess your image
# The processor handles resizing, letterboxing, and normalization.
image = Image.open("path/to/your/image.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
# 3. Prepare conditioning input
# This should be an integer tensor representing the organ ID.
# Replace `4` with the appropriate ID for your use case.
organ_id = torch.tensor([4])
# 4. Run inference
with torch.no_grad():
    outputs = model(**inputs, organ_id=organ_id)
# 5. Post-process the output to get the final segmentation mask
# The processor can convert the logits to a binary mask, automatically handling
# the removal of letterbox padding and resizing to the original image dimensions.
mask = processor.post_process_semantic_segmentation(
    outputs,
    inputs,
    threshold=0.7,
    return_as_pil=True
)[0]
# 6. Save the result
mask.save("output_mask.png")
print("Segmentation mask saved to output_mask.png")
Model Details
- Architecture: U-Net with FiLM layers for conditional segmentation.
- Conditioning: The model's output is conditioned on an organ_id input.
- Input: RGB images.
- Output: A single-channel segmentation mask.
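To make the conditioning idea concrete, here is a minimal pure-Python sketch of a FiLM (feature-wise linear modulation) step: each feature channel is scaled by a per-channel gamma and shifted by a per-channel beta chosen by the organ ID. The `film` function, the `film_params` table, and all numeric values are hypothetical and only illustrate the general technique; this is not the code shipped in this repository, where the parameters would be learned and applied to tensors.

```python
def film(features, gamma, beta):
    """Apply FiLM: scale every value of channel c by gamma[c], then add beta[c].

    features: list of channels, each a 2D list (rows of floats).
    gamma, beta: one float per channel.
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma and beta need one entry per channel")
    return [
        [[g * value + b for value in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Hypothetical per-organ parameters (assumption: illustrative values only;
# a trained model would learn these from the organ_id embedding).
film_params = {
    0: {"gamma": [1.0, 0.5], "beta": [0.0, 1.0]},
    4: {"gamma": [2.0, -1.0], "beta": [0.5, 0.0]},
}

# A toy feature map with two channels of size 2x2.
features = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[1.0, 1.0], [1.0, 1.0]],  # channel 1
]

params = film_params[4]  # select the modulation for organ ID 4
modulated = film(features, params["gamma"], params["beta"])
print(modulated)
```

The same input features produce different outputs for different organ IDs, which is how a single set of convolutional weights can be shared across organs.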
Configuration
The model configuration can be accessed via model.config. Key parameters include:
- in_channels: Number of input channels (default: 3).
- num_classes: Number of output classes (default: 1).
- n_organs: The number of different organs the model was trained to condition on.
- depth: The depth of the U-Net.
- size: The base number of filters in the first layer.
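As a small sketch of how these fields might be inspected, the helper below collects the parameters listed above from any config-like object. `summarize_config` is a hypothetical helper, and `example_config` is a stand-in for model.config: its n_organs, depth, and size values are placeholders, not the values of the released checkpoint.

```python
from types import SimpleNamespace

# Field names taken from the list above.
KEY_FIELDS = ("in_channels", "num_classes", "n_organs", "depth", "size")


def summarize_config(config):
    """Return the key architecture fields of a config object as a dict.

    Fields missing from the object are reported as None.
    """
    return {name: getattr(config, name, None) for name in KEY_FIELDS}


# Stand-in for model.config (assumption: placeholder values for illustration).
example_config = SimpleNamespace(
    in_channels=3, num_classes=1, n_organs=8, depth=5, size=32
)

for name, value in summarize_config(example_config).items():
    print(f"{name}: {value}")
```

With a loaded model, passing model.config instead of `example_config` should work the same way, assuming the attributes carry the names listed above.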
Organ IDs
The organ_id passed to the model corresponds to the following mapping:
organ_to_class_dict = {
"appendix": 0,
"breast": 1,
"breast_luminal": 1,
"cardiac": 2,
"thyroid": 3,
"fetal": 4,
"kidney": 5,
"liver": 6,
"testicle": 7,
}
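Looking the ID up by name avoids hard-coding integers in inference scripts. The sketch below copies the mapping above into a module-level constant; `ORGAN_TO_CLASS` and `organ_id_for` are hypothetical names introduced here, not part of the repository's API.

```python
# Copy of the mapping documented above.
ORGAN_TO_CLASS = {
    "appendix": 0,
    "breast": 1,
    "breast_luminal": 1,
    "cardiac": 2,
    "thyroid": 3,
    "fetal": 4,
    "kidney": 5,
    "liver": 6,
    "testicle": 7,
}


def organ_id_for(name):
    """Return the integer organ ID for a name, ignoring case and outer whitespace."""
    key = name.strip().lower()
    if key not in ORGAN_TO_CLASS:
        valid = ", ".join(sorted(ORGAN_TO_CLASS))
        raise ValueError(f"Unknown organ {name!r}; expected one of: {valid}")
    return ORGAN_TO_CLASS[key]


print(organ_id_for("kidney"))
```

The result can then be wrapped for the model call, for example `organ_id = torch.tensor([organ_id_for("kidney")])`. Note that "breast" and "breast_luminal" share the same ID.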
Alternative Versions
This repository contains multiple versions of the model located in subfolders. You can load a specific version by using the subfolder parameter.
4-Stage U-Net
This version has a U-Net depth of 4.
from transformers import AutoModel
model_4_stages = AutoModel.from_pretrained(
    "AImageLab-Zip/US_FiLMUNet",
    subfolder="unet_4_stages",
    trust_remote_code=True
)