DNN Partitioning and Assignment for Distributed Inference in SGX Empowered Edge Cloud

Yuepeng Li, Deze Zeng*, Lin Gu, Song Guo, Albert Y. Zomaya

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/Report; Conference Paper published in a book; peer-review

2 Citations (Scopus)

Abstract

Distributed Deep Neural Network (DNN) inference is a promising technology for exploiting the distributed resources of the edge cloud to realize edge intelligence. Meanwhile, the inherent resource-sharing nature of edge cloud infrastructure raises serious security and privacy concerns. Software Guard Extensions (SGX) has emerged as a potential hardware-level solution, but its limited secure memory (i.e., the enclave page cache) imposes new challenges, especially for memory-hungry DNN models. A task's performance is severely degraded when its memory footprint exceeds the enclave page cache size, due to expensive secure page swapping. How to appropriately partition a DNN model and assign the partitions to distributed edge servers, so as to efficiently utilize edge resources for fast secure inference, therefore becomes a challenging problem. In this paper, we first show that this problem is NP-hard. We then propose a MEmory-aware Distributed Inference Acceleration (MEDIA) algorithm and formally analyze its guaranteed approximation ratio. We have implemented a prototype system and evaluated MEDIA's performance on several well-known, representative DNN models. Extensive experiments verify the efficiency of MEDIA: it reduces inference time by 19.5%-38.1% in comparison with state-of-the-art approaches.

Original language: English
Title of host publication: Proceedings - 2024 IEEE 44th International Conference on Distributed Computing Systems, ICDCS 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 635-644
Number of pages: 10
ISBN (Electronic): 9798350386059
Publication status: Published - 2024
Externally published: Yes
Event: 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024 - Jersey City, United States
Duration: 23 Jul 2024 to 26 Jul 2024

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
ISSN (Print): 1063-6927
ISSN (Electronic): 2575-8411

Conference

Conference: 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024
Country/Territory: United States
City: Jersey City
Period: 23/07/24 to 26/07/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • DNN
  • Distributed inference
  • SGX
