Abstract
Distributed Deep Neural Network (DNN) inference is a promising technology for exploiting the distributed resources of the edge cloud to realize edge intelligence. Meanwhile, the inherent resource-sharing nature of edge cloud infrastructure raises serious concerns about security and privacy. Software Guard Extensions (SGX) has emerged as a potential hardware-level solution, but its limited secure memory (i.e., the enclave page cache) imposes new challenges, especially given memory-hungry DNN models. A task's performance is severely affected when its memory footprint exceeds the enclave page cache size, due to expensive secure page swapping. In this case, how to appropriately partition a DNN model and assign the partitions to distributed edge servers, so that edge resources are used efficiently for fast secure inference, becomes a challenging problem. In this paper, we first show that this problem is NP-hard. We further propose a MEmory-aware Distributed Inference Acceleration (MEDIA) algorithm, whose guaranteed approximation ratio is also formally analyzed. We have implemented a prototype system and applied well-known representative DNN models to evaluate MEDIA's performance. Extensive experiments verify the efficiency of MEDIA: it reduces inference time by 19.5%-38.1% in comparison with state-of-the-art approaches.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 IEEE 44th International Conference on Distributed Computing Systems, ICDCS 2024 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 635-644 |
| Number of pages | 10 |
| ISBN (Electronic) | 9798350386059 |
| Publication status | Published - 2024 |
| Externally published | Yes |
| Event | 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024 - Jersey City, United States; Duration: 23 Jul 2024 → 26 Jul 2024 |
Publication series
| Name | Proceedings - International Conference on Distributed Computing Systems |
|---|---|
| ISSN (Print) | 1063-6927 |
| ISSN (Electronic) | 2575-8411 |
Conference
| Conference | 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024 |
|---|---|
| Country/Territory | United States |
| City | Jersey City |
| Period | 23/07/24 → 26/07/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Keywords
- DNN
- Distributed inference
- SGX