arxiv:2503.23529

ViLAaD: Enhancing "Attracting and Dispersing" Source-Free Domain Adaptation with Vision-and-Language Model

Published on Mar 30, 2025
AI-generated summary

A new method, ViL-enhanced AaD (ViLAaD), integrates Vision-and-Language (ViL) models into Source-Free Domain Adaptation (SFDA) to boost adaptation performance; its extended version, ViLAaD++, achieves state-of-the-art results across various SFDA scenarios.

Abstract

Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to a target dataset from a different domain without access to the source data. Conventional SFDA methods are limited by the information encoded in the pre-trained source model and the unlabeled target data. Recently, approaches leveraging auxiliary resources have emerged, yet remain in their early stages, offering ample opportunities for research. In this work, we propose a novel method that incorporates auxiliary information by extending an existing SFDA framework using Vision-and-Language (ViL) models. Specifically, we build upon Attracting and Dispersing (AaD), a widely adopted SFDA technique, and generalize its core principle to naturally integrate ViL models as a powerful initialization for target adaptation. Our approach, called ViL-enhanced AaD (ViLAaD), preserves the simplicity and flexibility of the AaD framework, while leveraging ViL models to significantly boost adaptation performance. We validate our method through experiments using various ViL models, demonstrating that ViLAaD consistently outperforms both AaD and zero-shot classification by ViL models, especially when both the source model and ViL model provide strong initializations. Moreover, the flexibility of ViLAaD allows it to be seamlessly incorporated into an alternating optimization framework with ViL prompt tuning and extended with additional objectives for target model adaptation. Extensive experiments on four SFDA benchmarks show that this enhanced version, ViLAaD++, achieves state-of-the-art performance across multiple SFDA scenarios, including Closed-set SFDA, Partial-set SFDA, and Open-set SFDA.
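For readers unfamiliar with the AaD framework the abstract builds on, the following minimal PyTorch sketch illustrates the attract/disperse idea: predictions are pulled toward those of each sample's nearest neighbors in feature space and pushed away from the rest. This is an illustrative sketch under assumptions, not the authors' implementation; the names `feat_bank`, `prob_bank`, `k`, and `lam` are hypothetical, and per the abstract ViLAaD additionally uses a ViL model's zero-shot predictions as a strong initialization for the target adaptation.

```python
# Illustrative AaD-style objective (a sketch, not the paper's exact code).
import torch

def aad_style_loss(feats, probs, feat_bank, prob_bank, k=3, lam=1.0):
    """Attract predictions toward nearest neighbors; disperse from the rest.

    feats:      (B, D) L2-normalized target features for the current batch
    probs:      (B, C) softmax predictions for the current batch
    feat_bank:  (N, D) memory bank of target features (L2-normalized)
    prob_bank:  (N, C) memory bank of corresponding predictions
    """
    # Find each sample's k nearest neighbors in feature space.
    sim = feats @ feat_bank.t()          # (B, N) cosine similarity
    _, nn_idx = sim.topk(k, dim=1)       # (B, k) neighbor indices
    nn_probs = prob_bank[nn_idx]         # (B, k, C) neighbors' predictions

    # Attraction: encourage agreement with close neighbors' predictions.
    attract = (probs.unsqueeze(1) * nn_probs).sum(-1).mean()

    # Dispersion: discourage agreement with arbitrary (mostly dissimilar)
    # samples in the bank.
    disperse = (probs @ prob_bank.t()).mean()

    return -attract + lam * disperse
```

In this reading, ViLAaD keeps the simplicity of the attract/disperse objective while the ViL model supplies better initial predictions than the source model alone; the precise generalization is given in the paper.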
