IMAGE CLUSTERING VIA THE PRINCIPLE OF RATE REDUCTION IN THE AGE OF PRE-TRAINED MODELS

Tianzhe Chu, Shengbang Tong, Tianjiao Ding, Xili Dai, Benjamin D. Haeffele, René Vidal, Yi Ma

Research output: Contribution to conference > Conference Paper > peer-review

Abstract

The advent of large pre-trained models has brought about a paradigm shift in both visual representation learning and natural language processing. However, clustering unlabeled images, as a fundamental and classic machine learning problem, still lacks an effective solution, particularly for large-scale datasets. In this paper, we propose a novel image clustering pipeline that leverages the powerful feature representation of large pre-trained models such as CLIP and clusters images effectively and efficiently at scale. We first develop a novel algorithm to estimate the number of clusters in a given dataset. We then show that the pre-trained features become significantly more structured when further optimized with the rate reduction objective. The resulting features may significantly improve the clustering accuracy, e.g., from 57% to 66% on ImageNet-1k. Furthermore, by leveraging CLIP's multimodality bridge between image and text, we develop a simple yet effective self-labeling algorithm that produces meaningful captions for the clusters. Through extensive experiments, we show that our pipeline works well on standard datasets such as CIFAR-10, CIFAR-100, and ImageNet-1k. It also extends to datasets that are not curated for clustering, such as LAION-Aesthetics and WikiArts. The code is released at https://github.com/LeslieTrue/CPP.
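
The abstract names the rate reduction objective without stating it. As a rough illustration only, the sketch below computes the standard coding rate reduction (total coding rate minus the size-weighted coding rates of the clusters) with NumPy. The function names, the `eps` distortion parameter, and the use of hard labels are assumptions made for this example. This is not the authors' implementation, which is available at the repository linked above.

```python
import numpy as np


def coding_rate(Z, eps=0.5):
    """Coding rate R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z^T Z).

    Z is an (n, d) array of features, one row per sample.
    eps is an assumed distortion parameter chosen for illustration.
    """
    n, d = Z.shape
    gram = Z.T @ Z  # (d, d) second-moment matrix of the features
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * gram)
    return 0.5 * logdet


def rate_reduction(Z, labels, eps=0.5):
    """Rate reduction: R(Z) minus the size-weighted sum of per-cluster rates.

    labels is a length-n array of hard cluster assignments (an assumption
    of this sketch; soft memberships would need a weighted variant).
    """
    n = Z.shape[0]
    total = coding_rate(Z, eps)
    within = 0.0
    for k in np.unique(labels):
        Zk = Z[labels == k]
        within += (Zk.shape[0] / n) * coding_rate(Zk, eps)
    return total - within
```

A larger value indicates that the features span a large volume overall while each cluster stays compact, which is the sense in which optimizing this quantity makes features "more structured".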

Original language: English
Publication status: Published - 2024
Externally published: Yes
Event: 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration: 7 May 2024 to 11 May 2024

Conference

Conference: 12th International Conference on Learning Representations, ICLR 2024
Country/Territory: Austria
City: Hybrid, Vienna
Period: 7/05/24 to 11/05/24

Bibliographical note

Publisher Copyright:
© 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
