Abstract
The objective of Active Learning is to strategically label a subset of the dataset to maximize performance within a predetermined labeling budget. In this study, we harness features acquired through self-supervised learning. We introduce a straightforward yet potent metric, Cluster Distance Difference, to identify diverse data. Subsequently, we introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data. Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%. Moreover, we assess the efficacy of our proposed framework under extended settings, encompassing both larger and smaller labeling budgets. Experimental results demonstrate that, when labeling 80% of the samples, the performance of the current SOTA method declines by 0.74%, whereas our proposed BAL achieves performance comparable to training on the full dataset.
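The abstract does not define Cluster Distance Difference precisely, but a plausible reading is a per-sample gap between its distances to the nearest and second-nearest cluster centers in the self-supervised feature space (a small gap suggests the sample lies near a cluster boundary). The sketch below is a hypothetical illustration under that assumption, not the paper's actual formulation; the function name and the toy data are invented for the example.

```python
import numpy as np

def cluster_distance_difference(features, centers):
    """Hypothetical diversity score: the gap between each sample's
    Euclidean distance to its nearest and second-nearest cluster
    center. A smaller gap means the sample sits closer to a cluster
    boundary and could be treated as more 'diverse'."""
    # Pairwise distances, shape (n_samples, n_centers)
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    d_sorted = np.sort(d, axis=1)  # ascending per sample
    return d_sorted[:, 1] - d_sorted[:, 0]

# Toy example: three samples, two cluster centers on a line
feats = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
cents = np.array([[0.0, 0.0], [2.0, 0.0]])
scores = cluster_distance_difference(feats, cents)
# The middle sample is equidistant from both centers, so its gap is 0
```

A selection strategy built on this score might then fill diversity-oriented sub-pools with the lowest-gap samples before balancing them against an uncertainty criterion, as BAL's adaptive sub-pools are described to do.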
| Original language | English |
|---|---|
| Article number | 10372131 |
| Pages (from-to) | 3653-3664 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 46 |
| Issue number | 5 |
| DOIs | |
| Publication status | Published - 1 May 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 1979-2012 IEEE.
Keywords
- Active learning
- computer vision
- contrastive learning