
Network compression via quantization and sparsification

  • Lu HOU

Student thesis: Doctoral thesis

Abstract

Deep neural network models, though very powerful and highly successful, are computationally expensive in terms of space and time. Recently, there have been a number of attempts at compressing these networks. Such attempts greatly reduce the network size and make it possible to deploy deep models in resource-constrained environments. In this thesis, we focus on two kinds of network compression methods: quantization and sparsification. We first propose to directly minimize the loss w.r.t. the quantized weights by using the proximal Newton algorithm. We provide a closed-form solution for binarization, as well as an efficient approximate solution for ternarization and m-bit (where m > 2) quantization. To speed up distributed training of weight-quantized networks, we then propose to use gradient quantization to reduce the communication cost, and theoretically study how the combination of weight and gradient quantization affects convergence. In addition, since previous quantization methods usually have inferior performance on LSTMs, we study why training quantized LSTMs is difficult, and show that popular normalization schemes can help stabilize the training of quantized LSTMs. While weight quantization reduces redundancy in the representation of each weight, network sparsification reduces redundancy in the number of weights. To achieve a higher compression rate, we extend the previous quantization-only formulation to a more general network compression framework, which allows simultaneous quantization and sparsification. Finally, we find that sparse deep neural networks obtained by pruning resemble biological networks in many ways. Inspired by the power-law distributions of many biological networks, we show that these pruned deep networks also exhibit power-law properties, and that these properties can be exploited for faster learning and smaller networks in continual learning.
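To make the loss-aware binarization idea concrete: with a diagonal approximation to the Hessian (e.g., the second-moment accumulator of an Adam-style optimizer), the proximal Newton step for binary weights admits a closed-form solution in which each layer's weights become a single curvature-weighted scale times their signs. The sketch below is illustrative only; the variable names and the use of NumPy are our assumptions, not the thesis's notation.

```python
import numpy as np

def binarize_loss_aware(w, d):
    """One closed-form proximal step for loss-aware binarization.

    w : full-precision weight vector of one layer
    d : positive diagonal curvature estimate for the same layer
        (hypothetical stand-in for an Adam second-moment accumulator)

    Returns alpha * sign(w), where the scale alpha weights each
    coordinate's magnitude by its estimated curvature, so that
    high-curvature (loss-sensitive) weights dominate the scale.
    """
    alpha = np.sum(d * np.abs(w)) / np.sum(d)  # curvature-weighted scale
    return alpha * np.sign(w)

# Toy usage: four weights with differing curvature estimates.
w = np.array([0.5, -1.2, 0.3, -0.1])
d = np.array([1.0, 2.0, 1.0, 0.5])
w_binary = binarize_loss_aware(w, d)
```

Note that when `d` is constant, `alpha` reduces to the plain mean of `|w|`, i.e., the curvature-agnostic scaling used by earlier binarization schemes; the curvature weighting is what ties the quantizer to the loss.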
Date of Award: 2019
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
