Communication-efficient wireless network and training algorithm design for machine learning applications

  • Liqun SU

Student thesis: Doctoral thesis

Abstract

Federated learning (FL) has emerged as a widely used collaborative training framework that leverages local datasets held by distributed sensors. By aggregating the local gradients, model updates, or intermediate computation results from the sensors, the central server iteratively updates the model weights of a neural network. However, practical communication constraints of the network can inject transmission noise into the training process, which degrades training performance. It is therefore essential to develop communication-efficient transmission policies and training algorithms that adapt to practical training scenarios.

We first focus on the conventional horizontal FL framework and explore over-the-air gradient aggregation to reduce communication complexity. For the model-update dynamics, we propose a dynamic residual feedback mechanism in which each sensor maintains a local residual that stores its untransmitted gradients; this mechanism mitigates the potential bias caused by channel and data distortion. In addition, we use the Lyapunov drift optimization method to analyze the relationship between training gain and resource allocation. This motivates a decentralized scheduling and power control policy that adapts to both the channel state information (CSI) and data importance, seizing good transmission opportunities to upload important gradients.

Next, we observe that modern IoT networks deploy multiple groups of multi-type sensors for data collection and the training of AI models. Each sensor collects partial data samples on a type-specific feature space, resulting in a hybrid data partitioning across local datasets. Unlike conventional horizontal and vertical FL, such hybrid FL involves both feature and sample diversity, which poses challenges for designing a scalable training framework. To tackle this issue, we propose a hierarchical federated learning framework with a multi-tier partitioned neural network architecture. Specifically, we adopt a primal-dual transform to decompose the training problem over both the sample and feature spaces, and then implement a stochastic gradient descent-ascent algorithm with intra-type and inter-type over-the-air aggregation for updating the primal and dual variables, respectively.

We further study the implementation of hybrid FL in digital transmission scenarios. We propose a stochastic primal-descent dual-ascent training method with a two-sided residual feedback mechanism for the variable updates. We then propose a decentralized joint scheduling, resource allocation, and dynamic quantization policy that adapts not only to the CSI but also to the instantaneous gradient importance and dynamic gradient statistics, yielding better training performance under limited communication resources.

Lastly, we shift our focus to training algorithm design that exploits the characteristics of practical transmission scenarios to accelerate the training process. To gain clearer insight into the model-update dynamics, we model the discrete-time training trajectory with a stochastic differential equation (SDE). Through a high-order drift approximation analysis of a general momentum-based SDE model, we propose a dynamic step-size design for the update rule. This design adapts to both the training state and the gradient quality, ensuring that the training algorithm seizes good update opportunities and avoids noise explosion.
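As a concrete illustration of the residual feedback idea, the sketch below runs over-the-air gradient aggregation on a toy least-squares problem in NumPy. It is a minimal sketch under stated assumptions: the random 50% scheduling decision stands in for the CSI- and importance-aware policy, the channel is modeled as a plain noisy sum, and all names and constants are illustrative rather than the thesis's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, SENSORS, ROUNDS = 10, 8, 200
LR, NOISE_STD = 0.1, 0.05

# Toy least-squares data split across sensors (illustrative only).
A = [rng.standard_normal((20, DIM)) for _ in range(SENSORS)]
b = [a @ np.ones(DIM) + 0.1 * rng.standard_normal(20) for a in A]

w = np.zeros(DIM)
residual = [np.zeros(DIM) for _ in range(SENSORS)]  # per-sensor residual memory

for t in range(ROUNDS):
    messages = []
    for k in range(SENSORS):
        grad = A[k].T @ (A[k] @ w - b[k]) / len(b[k])
        msg = grad + residual[k]            # fold untransmitted gradient back in
        scheduled = rng.random() < 0.5      # stand-in for the CSI/importance policy
        if scheduled:
            messages.append(msg)
            residual[k] = np.zeros(DIM)     # transmitted: clear the residual
        else:
            residual[k] = msg               # skipped: carry the message over
    if messages:
        # Over-the-air aggregation: the channel sums the analog signals, plus noise.
        agg = sum(messages) + NOISE_STD * rng.standard_normal(DIM)
        w -= LR * agg / len(messages)
```

The key invariant is that a skipped sensor's message is never discarded: it is folded into the next round's transmission, which removes the bias that hard scheduling decisions would otherwise introduce.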
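The stochastic gradient descent-ascent pattern used for the decomposed hybrid-FL problem can be shown on a simple saddle problem. In this sketch the intra-type and inter-type over-the-air aggregation is abstracted away as additive gradient noise, and the bilinear objective, step sizes, and dimensions are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
A = 0.5 * rng.standard_normal((5, 8))
b = rng.standard_normal(5)

# Toy saddle problem: min_x max_lam  0.5*||x||^2 + lam^T (A x - b),
# standing in for the primal-dual transformed training objective.
x = np.zeros(8)      # primal variable (model weights)
lam = np.zeros(5)    # dual variable (coupling across sensor types)
eta_x, eta_lam = 0.05, 0.05

for t in range(3000):
    gx = x + A.T @ lam + 0.01 * rng.standard_normal(8)  # noisy primal gradient
    glam = A @ x - b + 0.01 * rng.standard_normal(5)    # noisy dual gradient
    x -= eta_x * gx          # primal descent step
    lam += eta_lam * glam    # dual ascent step

print("constraint residual:", np.linalg.norm(A @ x - b))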
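In the digital transmission setting, gradients must be quantized before upload. The sketch below pairs unbiased stochastic uniform quantization with a naive dynamic bit allocation that spends more of a shared bit budget on larger-norm (more "important") messages. The allocation rule and all parameters are hypothetical illustrations, not the policy derived in the thesis.

```python
import numpy as np

def stochastic_quantize(x, bits, rng):
    """Unbiased stochastic uniform quantization of x onto 2**bits levels."""
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        return x.copy()
    levels = 2 ** bits - 1
    scaled = (x - lo) / (hi - lo) * levels
    low = np.floor(scaled)
    up = rng.random(x.shape) < (scaled - low)  # round up with prob = fractional part
    return lo + (low + up) / levels * (hi - lo)

def allocate_bits(grad_norms, total_bits, b_min=2, b_max=8):
    """Illustrative dynamic allocation: larger-norm gradients get more bits."""
    share = np.asarray(grad_norms) / (np.sum(grad_norms) + 1e-12)
    return np.clip(np.round(share * total_bits), b_min, b_max).astype(int)

rng = np.random.default_rng(3)
grads = [rng.standard_normal(100) * s for s in (0.1, 1.0, 5.0)]
bits = allocate_bits([np.linalg.norm(g) for g in grads], total_bits=24)
qgrads = [stochastic_quantize(g, b, rng) for g, b in zip(grads, bits)]
```

Because the quantizer rounds up with probability equal to the fractional part, its output is unbiased, so the quantization error behaves like zero-mean noise that the residual feedback mechanism can absorb.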
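Finally, the flavor of the dynamic step-size design can be conveyed with a simple heuristic: shrink the step whenever the estimated gradient-noise power dominates the gradient signal. This toy rule is only a stand-in for the high-order drift analysis in the thesis; the quadratic objective, known noise variance, and constants below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
DIM, BETA, BASE_LR, NOISE_VAR = 10, 0.9, 0.5, 0.25

def step_size(grad):
    # Estimate the clean-gradient power by subtracting the known noise power,
    # then shrink the step as the noise share grows (illustrative rule only).
    power = float(grad @ grad)
    signal = max(power - DIM * NOISE_VAR, 0.0)
    return BASE_LR * signal / (signal + DIM * NOISE_VAR)

w, m = 5.0 * rng.standard_normal(DIM), np.zeros(DIM)
for t in range(500):
    grad = w + np.sqrt(NOISE_VAR) * rng.standard_normal(DIM)  # noisy gradient of 0.5*||w||^2
    m = BETA * m + (1 - BETA) * grad                          # momentum state, as in the SDE model
    w -= step_size(grad) * m                                  # step adapts to gradient quality

print("final ||w||:", np.linalg.norm(w))
```

Early in training the gradient dominates the noise and the rule uses nearly the full base step; near the optimum the step shrinks toward zero, so the iterate settles instead of bouncing inside a noise ball.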
Date of Award: 2023
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Vincent Kin Nang LAU
