---
license: apache-2.0
tags:
- depth-estimation
- computer-vision
- monocular-depth
- multi-view-geometry
- pose-estimation
library_name: depth-anything-3
pipeline_tag: depth-estimation
---
# Depth Anything 3: DA3METRIC-LARGE
## Model Description
DA3METRIC-LARGE is the Depth Anything 3 model specialized for monocular metric depth estimation, intended for applications that need real-world scale. The network predicts canonical depth; multiplying that output by the camera focal length gives metric depth.
| Property | Value |
|----------|-------|
| **Model Series** | Monocular Metric Depth |
| **Parameters** | 0.35B |
| **License** | Apache 2.0 |
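The canonical-to-metric conversion mentioned in the description can be sketched as follows. This is a minimal illustration, not the library's own code: the function name and the `canonical_focal` normalization constant are hypothetical, and the numbers are arbitrary. Check the repository for the exact scaling the released model expects.

```python
import numpy as np


def canonical_to_metric(canonical_depth, focal_px, canonical_focal=1.0):
    """Scale canonical depth by the focal length to obtain metric depth.

    `canonical_focal` is a hypothetical normalization constant used only for
    illustration; it is not a value taken from this model card.
    """
    depth = np.asarray(canonical_depth, dtype=np.float32)
    return depth * (float(focal_px) / float(canonical_focal))


# Synthetic canonical depth map standing in for a model output.
canonical = np.full((2, 3), 0.5, dtype=np.float32)
metric = canonical_to_metric(canonical, focal_px=1000.0, canonical_focal=500.0)
print(metric)
```

Because the scaling is linear in the focal length, an error in the assumed focal length carries over proportionally to the metric depth.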
## Capabilities
- ✅ Relative Depth
- ✅ Metric Depth
- ✅ Sky Segmentation
## Quick Start
### Installation
```bash
git clone https://github.com/ByteDance-Seed/depth-anything-3
cd depth-anything-3
pip install -e .
```
### Basic Example
```python
import torch
from depth_anything_3.api import DepthAnything3
# Load model from Hugging Face Hub
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = DepthAnything3.from_pretrained("depth-anything/da3metric-large")
model = model.to(device=device)
# Run inference on images
images = ["image1.jpg", "image2.jpg"] # List of image paths, PIL Images, or numpy arrays
prediction = model.inference(
    images,
    export_dir="output",
    export_format="glb",  # Options: glb, npz, ply, mini_npz, gs_ply, gs_video
)
# Access results
print(prediction.depth.shape) # Depth maps: [N, H, W] float32
print(prediction.conf.shape) # Confidence maps: [N, H, W] float32
print(prediction.extrinsics.shape) # Camera poses (w2c): [N, 3, 4] float32
print(prediction.intrinsics.shape) # Camera intrinsics: [N, 3, 3] float32
```
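The shapes printed above are enough to lift a depth map into 3D. The sketch below uses the standard pinhole model and synthetic arrays in place of `prediction.depth[0]`, `prediction.intrinsics[0]`, and `prediction.extrinsics[0]`. It assumes the depth is z-depth along the optical axis, that pixel coordinates start at 0, and that a `[3, 4]` world-to-camera pose is available; whether this particular checkpoint returns poses is not confirmed here. The helper names are illustrative and not part of the `depth_anything_3` API.

```python
import numpy as np


def unproject_depth(depth, intrinsics):
    """Back-project an [H, W] z-depth map to [H*W, 3] camera-space points."""
    h, w = depth.shape
    # Pixel grid: u runs along columns (x), v along rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    fx, fy = intrinsics[0, 0], intrinsics[1, 1]
    cx, cy = intrinsics[0, 2], intrinsics[1, 2]
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def camera_to_world(points_cam, extrinsics_w2c):
    """Map [M, 3] camera-space points to world space using a [3, 4] w2c pose."""
    rotation, translation = extrinsics_w2c[:, :3], extrinsics_w2c[:, 3]
    # x_cam = R @ x_world + t, so x_world = R.T @ (x_cam - t).
    # For row-vector points this becomes (x_cam - t) @ R.
    return (points_cam - translation) @ rotation


# Synthetic stand-ins for one frame of the prediction.
depth = np.full((4, 6), 2.0, dtype=np.float32)
K = np.array([[100.0, 0.0, 3.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]])
w2c = np.hstack([np.eye(3), np.array([[0.0], [0.0], [1.0]])])

points_world = camera_to_world(unproject_depth(depth, K), w2c)
print(points_world.shape)
```

Filtering the points with a threshold on the confidence map before export is a common follow-up step; a suitable threshold depends on the scene.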
### Command Line Interface
```bash
# Process images with auto mode
da3 auto path/to/images \
    --export-format glb \
    --export-dir output \
    --model-dir depth-anything/da3metric-large
# Use backend for faster repeated inference
da3 backend --model-dir depth-anything/da3metric-large
da3 auto path/to/images --export-format glb --use-backend
```
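When the CLI is driven from a script, it can help to assemble the argument list in one place. The helper below is a hypothetical convenience wrapper, not part of the package. It uses only the flags shown above and assumes the CLI accepts the same export formats as the Python API. It builds the command without running it.

```python
import shlex

# Export formats listed in the Basic Example; assumed to match the CLI.
EXPORT_FORMATS = {"glb", "npz", "ply", "mini_npz", "gs_ply", "gs_video"}


def build_da3_auto_command(image_dir, export_format="glb", export_dir="output",
                           model_dir="depth-anything/da3metric-large",
                           use_backend=False):
    """Return the argv list for a `da3 auto` invocation."""
    if export_format not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {export_format!r}")
    cmd = ["da3", "auto", str(image_dir), "--export-format", export_format]
    if export_dir is not None:
        cmd += ["--export-dir", str(export_dir)]
    if use_backend:
        # Mirrors the backend example above, which passes no --model-dir.
        cmd.append("--use-backend")
    else:
        cmd += ["--model-dir", model_dir]
    return cmd


cmd = build_da3_auto_command("path/to/images")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the `da3` entry point is installed.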
## Model Details
- **Developed by:** ByteDance Seed Team
- **Model Type:** Vision Transformer for Visual Geometry
- **Architecture:** Plain transformer with unified depth-ray representation
- **Training Data:** Public academic datasets only
### Key Insights
- A **single plain transformer** (e.g., vanilla DINO encoder) is sufficient as a backbone without architectural specialization.
- A singular **depth-ray representation** obviates the need for complex multi-task learning.
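This card does not spell out the depth-ray representation. One plausible reading is that each pixel carries a depth value plus a ray (origin and direction), so a 3D point follows from a single multiply-add. The sketch below illustrates that reading with synthetic arrays. The function name and array layout are assumptions for illustration, not the model's actual output format.

```python
import numpy as np


def points_from_depth_and_rays(depth, ray_origins, ray_directions):
    """Combine per-pixel depth with per-pixel rays: point = origin + depth * direction.

    depth:          [H, W]
    ray_origins:    [H, W, 3]
    ray_directions: [H, W, 3]
    """
    return ray_origins + depth[..., None] * ray_directions


# Synthetic example: all rays start at the origin and point along +z.
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
origins = np.zeros((2, 2, 3))
directions = np.zeros((2, 2, 3))
directions[..., 2] = 1.0

points = points_from_depth_and_rays(depth, origins, directions)
print(points[..., 2])
```

Under this reading, depth and camera geometry share one per-pixel target, which is consistent with the claim that no separate multi-task heads are needed.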
## Performance
Depth Anything 3 significantly outperforms:
- **Depth Anything 2** for monocular depth estimation
- **VGGT** for multi-view depth estimation and pose estimation
For detailed benchmarks, please refer to our [paper](https://depth-anything-3.github.io).
## Limitations
- The model is trained on public academic datasets only, so accuracy may degrade on domain-specific images that fall outside that data
- Performance may vary depending on image quality, lighting conditions, and scene complexity
## Citation
If you find Depth Anything 3 useful in your research or projects, please cite:
```bibtex
@article{depthanything3,
title={Depth Anything 3: Recovering the visual space from any views},
author={Haotong Lin and Sili Chen and Jun Hao Liew and Donny Y. Chen and Zhenyu Li and Guang Shi and Jiashi Feng and Bingyi Kang},
journal={arXiv preprint arXiv:XXXX.XXXXX},
year={2025}
}
```
## Links
- [Project Page](https://depth-anything-3.github.io)
- [Paper](https://arxiv.org/abs/)
- [GitHub Repository](https://github.com/ByteDance-Seed/depth-anything-3)
- [Hugging Face Demo](https://huggingface.co/spaces/depth-anything/Depth-Anything-3)
- [Documentation](https://github.com/ByteDance-Seed/depth-anything-3#-useful-documentation)
## Authors
[Haotong Lin](https://haotongl.github.io/) · [Sili Chen](https://github.com/SiliChen321) · [Junhao Liew](https://liewjunhao.github.io/) · [Donny Y. Chen](https://donydchen.github.io) · [Zhenyu Li](https://zhyever.github.io/) · [Guang Shi](https://scholar.google.com/citations?user=MjXxWbUAAAAJ&hl=en) · [Jiashi Feng](https://scholar.google.com.sg/citations?user=Q8iay0gAAAAJ&hl=en) · [Bingyi Kang](https://bingykang.github.io/)