VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations

Yuan Gao^†, Hao Wu^†, Qingsong Wen, Kun Wang, Xian Wu, Xiaomeng Huang
† Equal contribution

Abstract: Reconstructing subsurface ocean dynamics, such as vertical velocity fields, from incomplete surface observations poses a critical challenge in Earth science, a field long hampered by the lack of standardized, analysis-ready benchmarks. To systematically address this issue and catalyze research, we first build and release KD48, a high-resolution ocean dynamics benchmark derived from petascale simulations and curated with expert-driven denoising. Building on this benchmark, we introduce VISION, a novel reconstruction paradigm based on Dynamic Prompting designed to tackle the core problem of missing data in real-world observations. The essence of VISION lies in its ability to generate a visual prompt on the fly from any available subset of observations, which encodes both data availability and the ocean's physical state. More importantly, we design a State-conditioned Prompting module that efficiently injects this prompt into a universal backbone, endowed with geometry- and scale-aware operators, to guide its adaptive adjustment of computational strategies. This mechanism enables VISION to precisely handle the challenges posed by varying input combinations. Extensive experiments on the KD48 benchmark demonstrate that VISION not only substantially outperforms state-of-the-art models but also exhibits strong generalization under extreme missing-data scenarios. By providing a high-quality benchmark and a robust model, our work establishes a solid infrastructure for ocean science research under data uncertainty. Our code is available at https://github.com/YuanGao-YG/VISION.

News 🚀

2025.09.25: Inference code, pre-trained weights, and demo data for the KD48 benchmark are released.
2025.09.25: The paper is released on arXiv.

Notes

The complete project is available on Hugging Face. Download the pretrained models and test data from there and place them in the same location.

KD48 Benchmark

Quick Start

Installation

CUDA 11.8

# git clone this repository
git clone https://github.com/YuanGao-YG/VISION.git
cd VISION

# create a new conda environment from environment.yml and activate it
conda env create -f environment.yml
conda activate vision
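
The steps above rely on `git` and `conda` being on the PATH, and on the `vision` environment being active afterwards. The following is a minimal sanity-check sketch, not part of the VISION repository: `check_cmd` is a hypothetical helper introduced here for illustration, and `nvidia-smi` is only expected on machines with an NVIDIA driver installed.

```shell
# Hypothetical helper (not shipped with VISION): report whether a command
# is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Tools used by the installation steps.
check_cmd git
check_cmd conda
# Only present when an NVIDIA driver is installed.
check_cmd nvidia-smi

# conda sets CONDA_DEFAULT_ENV on activation; prints "none" if no
# environment is active.
echo "active env: ${CONDA_DEFAULT_ENV:-none}"
```

If the last line does not report `vision`, the environment was probably not activated in the current shell; rerunning `conda activate vision` should fix that.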