Abstract
Popular methods for semi-supervised semantic segmentation mostly adopt a single network model based on convolutional neural networks (CNNs) and enforce consistency of the model's predictions under perturbations applied to the inputs or the model. However, such a learning paradigm struggles in two respects: (a) learning discriminative features for the unlabeled data, and (b) learning both global and local information from the whole image. In this paper, we propose a novel semi-supervised learning (SSL) approach, called Transformer-CNN Cohort (TCC), that consists of two students, one based on the vision transformer (ViT) and the other based on the CNN. Our method incorporates multi-level consistency regularization on the predictions and on the heterogeneous feature spaces via pseudo-labeling for the unlabeled data. First, as the inputs of the ViT student are image patches, the extracted feature maps encode crucial class-wise statistics. To exploit this, we propose class-aware feature consistency distillation (CFCD), which first uses the outputs of each student as pseudo labels and then generates class-aware feature (CF) maps for knowledge transfer between the two students. Second, as the ViT student has more uniform representations across all layers, we propose consistency-aware cross distillation (CCD) to transfer knowledge between the pixel-wise predictions of the cohort. We validate the TCC framework on the Cityscapes and Pascal VOC 2012 datasets, where it outperforms existing SSL methods by a large margin. Project page: https://vlislab22.github.io/TCC/.
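The two ideas named in the abstract, cross supervision through pseudo labels and class-aware features, can be illustrated with a minimal NumPy sketch. This is not the authors' CFCD or CCD formulation, which the abstract does not specify. The function names (`cross_pseudo_label_loss`, `class_prototypes`), the hard argmax pseudo labels, and the per-class mean pooling are all assumptions made only for illustration.

```python
import numpy as np

def softmax(logits, axis=0):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(logits, target):
    # logits: (C, H, W); target: (H, W) integer class map.
    logp = np.log(softmax(logits, axis=0) + 1e-12)
    picked = np.take_along_axis(logp, target[None], axis=0)[0]
    return -picked.mean()

def cross_pseudo_label_loss(logits_a, logits_b):
    # Assumption: each student is supervised by the other's hard
    # (argmax) pseudo labels; the paper may weight or filter these.
    pseudo_a = logits_a.argmax(axis=0)
    pseudo_b = logits_b.argmax(axis=0)
    return cross_entropy(logits_a, pseudo_b) + cross_entropy(logits_b, pseudo_a)

def class_prototypes(features, pseudo_labels, num_classes):
    # features: (D, H, W); pseudo_labels: (H, W).
    # Returns (num_classes, D): the mean feature vector of the pixels
    # assigned to each class, or zeros for classes that do not appear.
    d = features.shape[0]
    flat = features.reshape(d, -1)
    labels = pseudo_labels.reshape(-1)
    protos = np.zeros((num_classes, d))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = flat[:, mask].mean(axis=1)
    return protos
```

In a two-student setup, per-class summaries such as these could be compared across the ViT and CNN students once their feature dimensions are aligned. How TCC actually builds and matches its CF maps is described in the paper and on the project page.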
| Original language | English |
|---|---|
| Title of host publication | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 11147-11154 |
| Number of pages | 8 |
| ISBN (Electronic) | 9798350384574 |
| Publication status | Published - 2024 |
| Event | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan. Duration: 13 May 2024 → 17 May 2024 |
Publication series
| Name | Proceedings - IEEE International Conference on Robotics and Automation |
|---|---|
| ISSN (Print) | 1050-4729 |
Conference
| Conference | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 |
|---|---|
| Country/Territory | Japan |
| City | Yokohama |
| Period | 13/05/24 → 17/05/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Fingerprint
Dive into the research topics of 'Transformer-CNN Cohort: Semi-supervised Semantic Segmentation by the Best of Both Students'. Together they form a unique fingerprint.