Owl: Performance-Aware Scheduling for Resource-Efficient Function-as-a-Service Cloud

Huangshi TIAN, Suyi LI, Ao Wang, Wei WANG, Tianlong Wu, Haoran Yang

Research output: Chapter in Book/Conference Proceeding/Report › Conference Paper published in a book › peer-review

43 Citations (Scopus)

Abstract

This work documents our experience of improving the scheduler in Alibaba Function Compute, a public FaaS platform. It commences with our observation that memory and CPU are under-utilized in most FaaS sandboxes. A natural solution is to overcommit VM resources when allocating sandboxes, but the ensuing contention may cause performance degradation and compromise user experience. To complicate matters, degradation in FaaS can also arise from external factors, such as failed dependencies of user functions. We design Owl to achieve both high utilization and performance stability. It introduces a customizable rule system for users to specify their tolerance of degradation, and overcommits resources with a dual approach. (1) For less-invoked functions, it allocates resources to the sandboxes with a usage-based heuristic, keeps monitoring their performance, and remedies any detected degradation. It determines whether a degraded sandbox is affected externally by setting aside a contention-free environment and migrating the affected sandbox there to serve as a comparison baseline. (2) For frequently-invoked functions, Owl profiles the interference patterns among collocated sandboxes and places the sandboxes under the guidance of those profiles. The collocation profiling is designed to cope with the constraint that profiling has to be conducted in production. Owl further consolidates idle sandboxes to reduce resource waste. We prototype Owl in our production system and implement a representative benchmark suite to evaluate it. The results demonstrate that the prototype could reduce VM cost by 43.80% and effectively mitigate latency degradation, with negligible overhead incurred.
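The baseline comparison described for less-invoked functions can be illustrated with a minimal Python sketch. It is not Owl's implementation: the abstract does not specify the rule format, the latency statistic, or any thresholds, so `ToleranceRule`, `max_slowdown`, the use of the median, and both function names are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ToleranceRule:
    """Hypothetical user rule: tolerated slowdown relative to a reference latency."""
    max_slowdown: float  # assumption: 1.2 means up to 20% slower is tolerated


def is_degraded(latencies_ms, reference_ms, rule):
    """Return True if the median observed latency exceeds the tolerated slowdown."""
    return median(latencies_ms) > reference_ms * rule.max_slowdown


def attribute_degradation(overcommitted_ms, isolated_ms, reference_ms, rule):
    """Classify a sandbox using latencies from two environments.

    overcommitted_ms: latencies observed on the overcommitted VM
    isolated_ms: latencies observed after migration to a contention-free environment
    """
    if not is_degraded(overcommitted_ms, reference_ms, rule):
        return "healthy"
    if is_degraded(isolated_ms, reference_ms, rule):
        # Still slow without contention, so the cause is likely external
        # (for example, a failed dependency of the user function).
        return "external"
    # Performance recovers in isolation, so contention is the likely cause.
    return "contention"
```

Under these assumptions, a sandbox that stays slow after migration is labeled "external" and would not benefit from more resources, while one that recovers is labeled "contention" and is a candidate for remediation.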

Original language: English
Title of host publication: SoCC 2022 - Proceedings of the 13th Symposium on Cloud Computing
Publisher: Association for Computing Machinery, Inc
Pages: 78-93
Number of pages: 16
ISBN (Electronic): 9781450394147
Publication status: Published - 7 Nov 2022
Event: 13th Annual ACM Symposium on Cloud Computing, SoCC 2022 - San Francisco, United States
Duration: 7 Nov 2022 to 11 Nov 2022

Publication series

Name: SoCC 2022 - Proceedings of the 13th Symposium on Cloud Computing

Conference

Conference: 13th Annual ACM Symposium on Cloud Computing, SoCC 2022
Country/Territory: United States
City: San Francisco
Period: 7/11/22 to 11/11/22

Bibliographical note

Publisher Copyright:
© 2022 ACM.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 12 - Responsible Consumption and Production

Keywords

  • overcommitment
  • resource-management
  • scheduling
  • serverless
