Abstract Details

ID / Submission Time

187 / 2025-06-13 15:04:50

Title

CDADCLIP: Learning Prompts with Hybrid Semantic Fusion for Few-Shot Anomaly Detection under Domain Shift

Keywords

Few-shot, Visual-Language, Domain shift

Topic and Special Session

Special Sessions > SSL 08 New perspective of intelligent detection, diagnosis, prognosis, and maintenance: Generative AI and Industrial Large Model

Status

Final version

Authors

Ran An / Xi’an Jiaotong University; Xi’an; PR China; 710049; School of Mechanical Engineering

Jiafeng Tang / Xi’an Jiaotong University; School of Mechanical Engineering

Zhibin Zhao / Xi’an Jiaotong University; School of Mechanical Engineering

Xuefeng Chen / State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University

Abstract

Few-shot anomaly detection (FSAD) aims to identify anomalies using models trained on only a few samples. The task is particularly challenging in real-world scenarios because of domain shifts caused by variations in lighting conditions, object pose, and other environmental factors. Recently, large pre-trained vision-language models such as CLIP have shown promise in FSAD. However, most existing approaches rely on manually designed prompts to capture anomaly semantics, which are susceptible to environmental interference and labor-intensive to design. To address this, we propose cross-domain CLIP for anomaly detection (CDADCLIP), which adapts CLIP to FSAD under domain shift. CDADCLIP incorporates domain-invariant learnable prompts into CLIP to model normal and abnormal semantics. Furthermore, a Hybrid Semantic Fusion (HSF) module enhances anomaly detection performance by integrating region-level information with global features. Experimental results on the AeBAD-S dataset with domain shift demonstrate the superior performance of our method compared with existing state-of-the-art methods.
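
As an illustration only, the general idea of scoring an image against "normal" and "abnormal" prompt embeddings and combining region-level with global evidence can be sketched as follows. This is not the CDADCLIP or HSF implementation, which the abstract does not specify. The function name `anomaly_scores`, the weighted-sum fusion, the `alpha` weight, the `temperature` value, and the use of the maximum patch score are all assumptions made for this sketch, and the feature arrays stand in for the outputs of CLIP's image and text encoders.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def anomaly_scores(global_feat, patch_feats, text_feats,
                   temperature=0.07, alpha=0.5):
    """Hypothetical fusion of global and region-level anomaly evidence.

    global_feat: (D,)   image-level embedding (stand-in for a CLIP image feature)
    patch_feats: (N, D) region-level embeddings (stand-in for patch tokens)
    text_feats:  (2, D) prompt embeddings, row 0 = normal, row 1 = abnormal
    alpha:       assumed weight between global and region-level scores
    """
    g = l2_normalize(global_feat)
    p = l2_normalize(patch_feats)
    t = l2_normalize(text_feats)

    # Probability that the whole image matches the "abnormal" prompt.
    global_prob = softmax(g @ t.T / temperature)[1]

    # Per-region probability of matching the "abnormal" prompt.
    patch_prob = softmax(p @ t.T / temperature, axis=-1)[:, 1]

    # Assumed fusion rule: weighted sum of the global score and the
    # most anomalous region score.
    image_score = alpha * global_prob + (1.0 - alpha) * patch_prob.max()
    return image_score, patch_prob
```

In a learnable-prompt setting, `text_feats` would come from prompt tokens optimized during few-shot training rather than from hand-written text, while the scoring step itself could stay unchanged.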