LogChain: Cloud workflow reconstruction & troubleshooting with unstructured logs

Pengpeng Zhou*, Yang Wang, Zhenyu Li, Gareth Tyson, Hongtao Guan, Gaogang Xie

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

6 Citations (Scopus)

Abstract

Cloud-based virtualization has become a key part of building distributed applications. One of its many benefits is the ability to dynamically manage system capacity by creating, deleting and migrating virtual machines (VMs) on demand. This management process, however, depends on complex pipelines involving multiple service invocations across distributed nodes. This makes troubleshooting and debugging difficult, as these pipelines lack an integrated logging system. Instead, each service generates independent, unstructured log messages, with no way to link them into a single integrated workflow. We present LogChain, a tool that gathers and processes distributed unstructured logs to diagnose failures in cloud management tasks. It provides three key functions: (i) it infers task workflows from distributed unstructured logs; (ii) it labels these workflows with the tasks that triggered them; and (iii) it diagnoses potential failures in a workflow's execution, to support administrators with troubleshooting. We evaluate LogChain with realistic workloads and show that it exceeds the state of the art in terms of performance and accuracy.

Original language: English
Article number: 107279
Journal: Computer Networks
Volume: 175
Publication status: Published - 5 Jul 2020
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2020
