Abstract
The rapid advancement of Vision Transformer (ViT) models has greatly enhanced performance in computer vision tasks. However, deploying ViTs in resource-constrained environments is challenging because attention computation forms a bottleneck that demands extensive memory and computation resources. To address this issue, we propose HSViT, a dedicated hardware and software co-design framework for ViTs. HSViT introduces a configurable and efficient accelerator with dedicated dataflows that take advantage of multi-level compression, including feature map compression, token pruning, and hardware-friendly sparsity. The proposed accelerator reduces intermediate transmission for feature maps and Query, Key, and Value matrices while enhancing data reuse and processing element (PE) utilization for chain matrix multiplications. Moreover, an innovative Top-k engine, integrated into the accelerator, is presented to support various selection scenarios with high speed and low resource consumption. Experiments validate that the proposed HSViT delivers significant speedups of 123.91×, 29.5×, and 3.01× to 20.65× over conventional CPUs, GPUs, and prior art, respectively. HSViT also achieves a throughput of up to 731.5 GOP/s and PE utilization as high as 92%.
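As a rough software illustration of the token pruning and Top-k selection mentioned in the abstract, the sketch below keeps the k highest-scoring tokens while preserving their original order. The abstract does not say how token importance is scored or how the hardware Top-k engine is built, so the function name `prune_tokens`, the per-token `scores` input, and the tie-breaking rule are all assumptions made for illustration, not the authors' method.

```python
import heapq


def prune_tokens(tokens, scores, k):
    """Keep the k tokens with the highest scores, in their original order.

    Hypothetical helper: `scores` is assumed to hold one importance value
    per token (for example, attention received from a class token).
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    k = max(0, min(k, len(tokens)))  # clamp k to a valid range
    # Select indices of the k largest scores; ties go to the earlier token.
    top = heapq.nlargest(k, range(len(scores)), key=lambda i: (scores[i], -i))
    keep = sorted(top)  # restore original token order
    return [tokens[i] for i in keep], keep


if __name__ == "__main__":
    kept, idx = prune_tokens(["cls", "a", "b", "c", "d"],
                             [0.9, 0.1, 0.5, 0.3, 0.7], k=3)
    print(kept, idx)
```

Returning the kept indices alongside the tokens lets a caller apply the same selection to other per-token tensors, such as the corresponding rows of the Key and Value matrices.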
| Original language | English |
|---|---|
| Title of host publication | ISCAS 2024 - IEEE International Symposium on Circuits and Systems |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9798350330991 |
| Publication status | Published - 2024 |
| Externally published | Yes |
| Event | 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024, Singapore, Singapore. Duration: 19 May 2024 → 22 May 2024 |
Publication series
| Name | Proceedings - IEEE International Symposium on Circuits and Systems |
|---|---|
| ISSN (Print) | 0271-4310 |
Conference
| Conference | 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024 |
|---|---|
| Country/Territory | Singapore |
| City | Singapore |
| Period | 19/05/24 → 22/05/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Keywords
- FPGA
- Hardware and Software Co-design
- Model Compression
- Vision Transformer
Title
HSViT: A Hardware and Software Collaborative Design for Vision Transformer via Multi-level Compression