Abstract
This paper presents a novel knowledge-distillation-based model compression framework consisting of a student ensemble. It enables distillation of simultaneously learnt ensemble knowledge onto each of the compressed student models. Each model learns unique representations from the data distribution due to its distinct architecture. This helps the ensemble generalize better by combining every model’s knowledge. The distilled students and ensemble teacher are trained simultaneously without requiring any pretrained weights. Moreover, our proposed method can deliver multiple compressed students from a single training run, which is efficient and flexible for different scenarios. We provide comprehensive experiments using state-of-the-art classification models to validate our framework’s effectiveness. Notably, using our framework, a 97% compressed ResNet110 student model achieved a 10.64% relative accuracy gain over its individual baseline training on the CIFAR100 dataset. Similarly, a 95% compressed DenseNet-BC (k = 12) model achieved an 8.17% relative accuracy gain.
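The abstract describes the training scheme only at a high level, so the following is a minimal sketch of one way such a joint objective could look, not the paper's actual loss. It assumes the ensemble teacher is the average of the student logits, that each student is distilled with a temperature-softened KL term, and that the weighting `alpha` and temperature value are free hyperparameters; all function names are hypothetical. It computes the loss value only; in practice an autodiff framework would be used to train on it.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean cross-entropy between logits and integer class labels."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))

def kl_divergence(teacher_probs, student_probs):
    """Mean KL(teacher || student) over the batch."""
    log_ratio = np.log(teacher_probs + 1e-12) - np.log(student_probs + 1e-12)
    return np.mean(np.sum(teacher_probs * log_ratio, axis=-1))

def ensemble_distillation_loss(student_logits, labels, temperature=2.0, alpha=0.5):
    """Joint loss for an ensemble teacher and its students.

    Assumption (not from the source): the teacher is the mean of the
    student logits, and each student mixes its own cross-entropy with a
    KL term toward the softened teacher distribution.
    """
    ensemble_logits = np.mean(np.stack(student_logits), axis=0)
    teacher_probs = softmax(ensemble_logits, temperature)

    # The ensemble itself is trained on the ground-truth labels.
    total = cross_entropy(ensemble_logits, labels)

    for logits in student_logits:
        ce = cross_entropy(logits, labels)
        kd = kl_divergence(teacher_probs, softmax(logits, temperature))
        # temperature**2 rescales the softened term, a common convention.
        total += (1.0 - alpha) * ce + alpha * temperature**2 * kd
    return total

# Usage: three hypothetical students, a batch of 4 samples, 10 classes.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=4)
students = [rng.normal(size=(4, 10)) for _ in range(3)]
loss = ensemble_distillation_loss(students, labels)
```

Because every student contributes to the averaged teacher and is in turn pulled toward it, all models can be optimized in one run from random initialization, which matches the abstract's claim that no pretrained teacher weights are needed.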
| Original language | English |
|---|---|
| Title of host publication | Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings |
| Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 18-35 |
| Number of pages | 18 |
| ISBN (Print) | 9783030585280 |
| Publication status | Published - 2020 |
| Externally published | Yes |
| Event | 16th European Conference on Computer Vision, ECCV 2020, Glasgow, United Kingdom (23 Aug 2020 → 28 Aug 2020) |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 12364 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 16th European Conference on Computer Vision, ECCV 2020 |
|---|---|
| Country/Territory | United Kingdom |
| City | Glasgow |
| Period | 23/08/20 → 28/08/20 |
Bibliographical note
Publisher Copyright: © 2020, Springer Nature Switzerland AG.
Keywords
- Deep model compression
- Ensemble deep model training
- Image classification
- Knowledge distillation