Abstract
We present ACON, a simple, effective, and general activation function that learns whether or not to activate each neuron. Interestingly, we find that Swish, the recently popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. In the same way, we smoothly approximate the more general Maxout family to obtain our novel ACON family, which remarkably improves performance and makes Swish a special case of ACON. Next, we present meta-ACON, which explicitly learns to optimize the parameter that switches between non-linear (activate) and linear (inactivate) behavior, providing a new design space. By simply changing the activation function, we show its effectiveness on both small models and highly optimized large models (e.g., it improves ImageNet top-1 accuracy by 6.7% on MobileNet-0.25 and 1.8% on ResNet-152). Moreover, ACON transfers naturally to object detection and semantic segmentation, showing that it is an effective alternative across a variety of tasks. Code is available at https://github.com/nmaac/acon.
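To make the abstract's central idea concrete, here is a minimal sketch of the ACON-C activation as described in the paper: a learnable smooth maximum of two linear branches, f(x) = (p1 − p2)·x·σ(β(p1 − p2)x) + p2·x. The parameter names p1, p2, and β follow the paper; setting p1 = 1 and p2 = 0 recovers Swish, and a large β pushes the function toward ReLU. This is an illustrative stand-alone implementation, not the authors' released code.

```python
import math

def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """ACON-C: a learnable smooth maximum of p1*x and p2*x.

    f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x
    beta controls the switch: beta -> 0 gives the linear mean of the
    two branches (inactivate); large beta approaches max(p1*x, p2*x).
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x) -- the p1=1, p2=0 special case."""
    return x * sigmoid(beta * x)
```

With the default p1 = 1, p2 = 0, `acon_c` coincides with `swish`, and for large β it closely tracks ReLU, which is the approximation chain (Maxout → ACON → Swish → ReLU) the abstract refers to.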
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |
| Publisher | IEEE Computer Society |
| Pages | 8028-8038 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781665445092 |
| DOIs | |
| Publication status | Published - 2021 |
| Event | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States. Duration: 19 Jun 2021 → 25 Jun 2021 |
Publication series
| Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
|---|---|
| ISSN (Print) | 1063-6919 |
Conference
| Conference | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |
|---|---|
| Country/Territory | United States |
| City | Virtual, Online |
| Period | 19/06/21 → 25/06/21 |
Bibliographical note
Publisher Copyright: © 2021 IEEE