CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No

Hualiang Wang, Yi Li, Huifeng Yao, Xiaomeng Li*

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/ReportConference Paper published in a bookpeer-review

88 Citations (Scopus)

Abstract

Out-of-distribution (OOD) detection refers to training the model on an in-distribution (ID) dataset to classify whether the input images come from unknown classes. Considerable effort has been invested in designing various OOD detection methods based on either convolutional neural networks or transformers. However, zero-shot OOD detection methods driven by CLIP, which only require class names for ID, have received less attention. This paper presents a novel method, namely CLIP saying "no"(CLIPN), which empowers the logic of saying "no"within CLIP. Our key motivation is to equip CLIP with the capability of distinguishing OOD and ID samples using positive-semantic prompts and negation-semantic prompts. Specifically, we design a novel learnable "no"prompt and a "no"text encoder to capture negation semantics within images. Subsequently, we introduce two loss functions: the image-text binary-opposite loss and the text semantic-opposite loss, which we use to teach CLIPN to associate images with "no"prompts, thereby enabling it to identify unknown samples. Furthermore, we propose two threshold-free inference algorithms to perform OOD detection by utilizing negation semantics from "no"prompts and the text encoder. Experimental results on 9 benchmark datasets (3 ID datasets and 6 OOD datasets) for the OOD detection task demonstrate that CLIPN, based on ViT-B-16, outperforms 7 well-used algorithms by at least 2.34% and 11.64% in terms of AUROC and FPR95 for zero-shot OOD detection on ImageNet-1K. Our CLIPN can serve as a solid foundation for effectively leveraging CLIP in downstream OOD tasks. The code is available on https://github.com/xmed-lab/CLIPN.

Original languageEnglish
Title of host publicationProceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1802-1812
Number of pages11
ISBN (Electronic)9798350307184
DOIs
Publication statusPublished - 2023
Event2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 - Paris, France
Duration: 2 Oct 20236 Oct 2023

Publication series

NameProceedings of the IEEE International Conference on Computer Vision
ISSN (Print)1550-5499

Conference

Conference2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Country/TerritoryFrance
CityParis
Period2/10/236/10/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Fingerprint

Dive into the research topics of 'CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No'. Together they form a unique fingerprint.

Cite this