Fair allocation of network resources for data-parallel applications is a challenging undertaking. The conflict between ever-increasing traffic volumes and limited link bandwidth is growing more intense. Moreover, the distributed nature of data-parallel applications produces a distinctive correlated traffic pattern: a job is considered complete only when its coflow (the flows of all its constituent tasks) has finished. In the face of these challenges, this thesis presents a systematic study of how to ensure the progress of network communication for data-parallel applications. Our first insight is that data locality should be fully exploited to reduce network transfers, thereby alleviating link contention and accelerating application progress. We propose Custody, a cluster management framework that non-intrusively retrieves locality information on input data blocks and fairly assigns machines holding local data to applications by solving the data-aware resource sharing problem. Even with data locality exploited, however, network transfers remain inevitable and are often enormous. Network isolation should therefore be provided so that the worst-case performance of each service is assured. We propose Libra, a solution that maximizes the network isolation guarantee by adjusting the placement of task containers. Our next endeavor is to navigate the trade-off between fairness and efficiency for data-parallel applications. Fairness ensures the progress of each application, but it impedes overall performance, such as the average coflow completion time (CCT). To bridge this gap, we design Coflex, a new coflow scheduler that exposes a tunable fairness knob to adjust the isolation guarantee while using the remaining bandwidth to decrease the average CCT. Finally, since more and more applications are required to finish within deadlines, we shift our focus to deadline-aware scheduling. We present Chronos, a scheduling framework that captures deadline-aware semantics and allocates network resources among multiple concurrent coflow applications.
| Date of Award | 2018 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
On the fairness of network resource allocation for data-parallel applications
MA, S. (Author). 2018
Student thesis: Doctoral thesis