---
license: apache-2.0
pipeline_tag: image-to-3d
tags:
- dino
- scene-understanding
- semantic-scene-completion
- unsupervised
library_name: pytorch
---

# Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

[**Aleksandar Jevtić**](https://jev-aleks.github.io/)\*¹ [**Christoph Reich**](https://christophreich1996.github.io/)\*¹,²,⁴,⁵ [**Felix Wimbauer**](https://fwmb.github.io/)¹,⁴ [**Oliver Hahn**](https://olvrhhn.github.io/)² [**Christian Rupprecht**](https://chrirupp.github.io/)³ [**Stefan Roth**](https://www.visinf.tu-darmstadt.de/visual_inference/people_vi/stefan_roth.en.jsp)²,⁵,⁶ [**Daniel Cremers**](https://cvg.cit.tum.de/members/cremers/)¹,⁴,⁵

¹TU Munich ²TU Darmstadt ³University of Oxford ⁴MCML ⁵ELIZA ⁶hessian.AI

\*equal contribution

Project Page URL

[![Framework](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?&logo=PyTorch&logoColor=white)](https://pytorch.org/)

## Overview

SceneDINO is an unsupervised method that infers 3D geometry and features from a single image in a feed-forward manner. Distilling and clustering SceneDINO's 3D feature field yields unsupervised semantic scene completion predictions. The method is trained using multi-view self-supervision.

## Installation & Quick Start

Please refer to our [GitHub repository](https://github.com/tum-vision/scenedino).

## Citation

If you find our work useful, please consider giving it a star ⭐ and citing our paper.

```bibtex
@inproceedings{Jevtic:2025:SceneDINO,
    author    = {Aleksandar Jevti{\'c} and Christoph Reich and Felix Wimbauer and Oliver Hahn and Christian Rupprecht and Stefan Roth and Daniel Cremers},
    title     = {Feed-Forward {SceneDINO} for Unsupervised Semantic Scene Completion},
    booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2025},
}
```
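
## Illustrative Sketch: Clustering Features

The overview mentions that clustering the 3D feature field yields semantic predictions. The snippet below is only a minimal sketch of that general idea and is not SceneDINO's actual pipeline; see the GitHub repository for the real implementation. It assumes cosine-similarity (spherical) k-means, uses NumPy and synthetic vectors as a stand-in for features sampled from a feature field, and the function name `kmeans_cosine` and all of its parameters are hypothetical.

```python
import numpy as np


def kmeans_cosine(features, num_clusters, num_iters=20, seed=0):
    """Cluster feature vectors with spherical (cosine-similarity) k-means.

    features: (N, C) array, one feature vector per sampled 3D point.
    Returns (labels, centroids) with shapes (N,) and (num_clusters, C).
    Hypothetical helper for illustration, not part of the SceneDINO code base.
    """
    rng = np.random.default_rng(seed)
    # L2-normalize so that dot products equal cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True).clip(min=1e-8)
    feats = features / norms
    # Initialize centroids from randomly chosen, distinct feature vectors.
    init_idx = rng.choice(len(feats), size=num_clusters, replace=False)
    centroids = feats[init_idx]
    for _ in range(num_iters):
        # Assign each point to its most similar centroid.
        labels = np.argmax(feats @ centroids.T, axis=1)
        for k in range(num_clusters):
            members = feats[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                mean = members.mean(axis=0)
                centroids[k] = mean / max(np.linalg.norm(mean), 1e-8)
    # Final assignment against the updated centroids.
    labels = np.argmax(feats @ centroids.T, axis=1)
    return labels, centroids


# Synthetic stand-in for sampled features: two well-separated directions.
rng = np.random.default_rng(0)
group_a = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((50, 3))
group_b = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((50, 3))
features = np.concatenate([group_a, group_b])

labels, centroids = kmeans_cosine(features, num_clusters=2)
```

Each resulting cluster index acts as a pseudo-class label for the points assigned to it. In an unsupervised setting, such cluster indices still have to be matched to named semantic classes before they can be evaluated.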