Abstract
Distributed stream big data analytics platforms have emerged to tackle the continuously generated data streams. In stream big data analytics, the data processing workflow is abstracted as a directed graph referred to as a topology. Data are read from the storage and processed tuple by tuple, and these processing results are updated dynamically. The performance of a topology is evaluated by its throughput. This paper proposes an efficient resource allocation scheme for a heterogeneous shared stream big data analytics cluster shared by multiple topologies, in order to achieve max-min fairness in the utilities of the throughput for all the topologies. We first formulate a novel model resource allocation problem, which is a mixed 0-1 integer program. The NP-hardness of the problem is rigorously proven. To tackle this problem, we transform the non-convex constraint to several linear constraints using linearization and reformulation techniques. Based on the analysis of the problem-specific structure and characteristics, we propose an approach that iteratively solves the continuous problem with a fixed set of discrete variables optimally, and updates the discrete variables heuristically. Simulations show that our proposed resource allocation scheme remarkably improves the max-min fairness in utilities of the topology throughput, and is low in computational complexity.
| Original language | English |
|---|---|
| Pages (from-to) | 130-137 |
| Journal | IEEE Transactions on Big Data |
| Volume | 4 |
| DOIs | |
| Publication status | Published - Mar 2018 |
Fingerprint
Dive into the research topics of 'Towards Max-Min Fair Resource Allocation for Stream Big Data Analytics in Shared Clouds'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver