# EdgeSAM - Efficient Segment Anything Model

EdgeSAM is an accelerated variant of the Segment Anything Model (SAM), optimized for edge devices using ONNX Runtime.
## Model Files

- `edge_sam_3x_encoder.onnx` - Image encoder (1024x1024 input)
- `edge_sam_3x_decoder.onnx` - Mask decoder with prompt support
## Usage

### API Request Format
```python
import requests
import base64

# Encode your image
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Make request
response = requests.post(
    "https://YOUR-ENDPOINT-URL",
    json={
        "inputs": image_b64,
        "parameters": {
            "point_coords": [[512, 512]],  # Click point in 1024x1024 space
            "point_labels": [1],           # 1 = foreground, 0 = background
            "return_mask_image": True,
        },
    },
)
result = response.json()
```
### Response Format

```json
[
  {
    "mask_shape": [1024, 1024],
    "has_object": true,
    "mask": "<base64_encoded_png>"
  }
]
```
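To use the returned mask, decode the base64 string back into PNG bytes. A minimal sketch (the `decode_mask` helper is illustrative, not part of the handler; `item` mirrors one entry of the response array above):

```python
import base64


def decode_mask(item: dict):
    """Return raw PNG bytes for one response entry, or None when no object was found."""
    if not item.get("has_object"):
        return None
    return base64.b64decode(item["mask"])
```

The returned bytes can be written straight to a `.png` file or opened with Pillow.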
### Parameters

- `point_coords`: Array of `[x, y]` coordinates in 1024x1024 space (optional)
- `point_labels`: Array of labels (1 = foreground, 0 = background) corresponding to the points (optional)
- `box_coords`: Bounding box `[x1, y1, x2, y2]` (optional, not yet implemented)
- `return_mask_image`: Return a base64-encoded PNG mask (default: `true`)
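Foreground and background points can be mixed in one request. A hypothetical payload combining one click of each kind (coordinate values chosen purely for illustration):

```python
# Hypothetical payload with one foreground and one background point,
# both expressed in the model's 1024x1024 coordinate space.
payload = {
    "inputs": "<base64-encoded image>",
    "parameters": {
        "point_coords": [[400, 300], [700, 800]],
        "point_labels": [1, 0],  # 1 = foreground, 0 = background
        "return_mask_image": True,
    },
}

# Each coordinate needs a matching label.
assert len(payload["parameters"]["point_coords"]) == len(payload["parameters"]["point_labels"])
```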
## Coordinate System

All coordinates should be given in 1024x1024 space, regardless of the original image size. The handler automatically resizes input images to 1024x1024 before processing.

Example: for a click at the center of any image, use `[512, 512]`.
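A small helper (illustrative, not part of the handler) to map a click on the original image into this 1024x1024 space:

```python
def to_model_coords(x, y, width, height, target=1024):
    """Map a pixel click on a width x height image into target x target space."""
    return [round(x * target / width), round(y * target / height)]
```

For instance, `to_model_coords(960, 540, 1920, 1080)` returns `[512, 512]`, the center click from the example above.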
## Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Run test script
python test_handler.py
```
This will create:

- `test_input.png` - Test image with red circle
- `test_output_mask.png` - Generated segmentation mask
- `test_output_overlay.png` - Overlay visualization
## Technical Details

- Input: RGB images (auto-resized to 1024x1024)
- Preprocessing: normalized to the [0, 1] range (`/ 255.0`)
- Hardware: supports CUDA GPUs with automatic CPU fallback
- Framework: ONNX Runtime Web compatible
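The resize-and-normalize step can be sketched with NumPy as follows; a nearest-neighbour resize stands in here for whatever interpolation the actual handler uses:

```python
import numpy as np


def preprocess(image: np.ndarray, size: int = 1024) -> np.ndarray:
    """Nearest-neighbour resize an HxWx3 uint8 image to size x size,
    then scale pixel values from [0, 255] to float32 [0, 1]."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0
```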