High-dimensional nonconvex dictionary learning: theories, algorithms, and applications

  • Ye XUE

Student thesis: Doctoral thesis

Abstract

In the era of big data, the size and dimensionality of data have reached an unprecedented scale, and they are still growing explosively. This has made it a huge challenge for scientists and engineers to process and analyze data in traditional ways. New data processing paradigms, such as machine learning techniques that rely more on huge volumes of data than on rigorously derived models, have achieved great success. However, several mysteries lie behind this success, and leaving them unsolved prevents a more sustainable development of machine learning. Two core questions that need to be answered are

• How many samples does a specific machine learning scheme require to achieve a satisfactory result?

• How many computational resources does a specific machine learning scheme require to achieve a satisfactory result?

Answering these two questions not only aids in understanding the fundamental principles of machine learning but also helps in designing better learning methods that save both data and computational resources. However, due to the high dimensionality and non-convexity of most modern machine learning problems, these two questions are extremely hard to answer. In this thesis, we investigate an essential and powerful machine learning problem, high-dimensional non-convex dictionary learning (also known as sparse coding), to try to answer the above two questions and to design sample-efficient and computation-efficient dictionary learning schemes.

First, we investigate a family of ℓp-norm (p > 2) maximization approaches to the complete dictionary learning problem from theoretical and algorithmic aspects. Specifically, we establish the sample complexity for the consistency of the ℓp-norm (p > 2) maximization formulation, which shows that p = 3 performs best. Based on the generalized power method (GPM), an efficient algorithm is then developed for the ℓp-based formulations, and we further prove the convergence of the developed algorithm for the non-convex ℓp-norm (p > 2) maximization problem.
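The GPM-style iteration for this kind of formulation can be sketched as follows. This is a minimal illustrative sketch, not the thesis's exact algorithm: the objective f(A) = ||AY||_p^p over orthogonal A, the fixed-point update A ← Polar(∇f(A)), and all parameter choices (sparsity level, iteration count) are assumptions made for the example.

```python
import numpy as np

def polar(M):
    """Orthogonal (polar) factor of M via SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def gpm_lp(Y, p=3, iters=50, seed=0):
    """Sketch of a generalized power method for max ||A Y||_p^p
    over orthogonal A: repeat A <- Polar(grad f(A))."""
    n = Y.shape[0]
    rng = np.random.default_rng(seed)
    A = polar(rng.standard_normal((n, n)))  # random orthogonal start
    for _ in range(iters):
        Z = A @ Y
        # grad f(A) = p * (sign(Z) .* |Z|^(p-1)) @ Y^T
        G = p * (np.sign(Z) * np.abs(Z) ** (p - 1)) @ Y.T
        A = polar(G)
    return A

# Toy run on synthetic data Y = D X (D orthogonal, X sparse).
rng = np.random.default_rng(1)
n, m = 8, 1000
D = polar(rng.standard_normal((n, n)))
X = rng.standard_normal((n, m)) * (rng.random((n, m)) < 0.3)
Y = D @ X
A = gpm_lp(Y, p=3)
```

Each iterate stays exactly orthogonal by construction, since the polar factor returned by `polar` is an orthogonal matrix.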

Next, we propose a sample-efficient dictionary learning scheme with a two-stage optimization based on the original ℓ3-norm maximization approach. The proposed scheme leverages the global and local Riemannian geometry of the two-stage optimization problem and admits a fast implementation with superb dictionary recovery performance from a finite number of samples. We further prove that, with high probability, the proposed scheme exactly recovers any atom in the target dictionary from a finite number of samples.

Then, we propose a computation-efficient online orthogonal dictionary learning scheme that dynamically learns the dictionary from streaming data without storing historical data; it can be implemented with a single sample or a mini-batch of samples per iteration. The proposed scheme comprises a novel problem formulation and an efficient online algorithm design with convergence analysis. In the problem formulation, we relax the orthogonality constraint to enable an efficient online algorithm. We then design a new Frank-Wolfe-based online algorithm with a convergence rate of O(ln t / t^(1/4)), and we also derive the convergence rate in terms of key system parameters.
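The online Frank-Wolfe idea can be sketched as below. Note the assumptions: the orthogonality constraint is relaxed here to the spectral-norm ball ||A||₂ ≤ 1 (one natural convex relaxation, whose linear maximization oracle is again a polar factor); the thesis's actual relaxation, per-sample ℓ3 objective, and the 1/√(t+2) step-size rule may differ, and the function name `online_fw_dictionary` is purely illustrative.

```python
import numpy as np

def polar(M):
    """Orthogonal (polar) factor of M via SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def online_fw_dictionary(stream, n, p=3, step=lambda t: 1.0 / (t + 2) ** 0.5):
    """One Frank-Wolfe-style pass over a sample stream (one sample per
    iteration), assuming the relaxed feasible set {A : ||A||_2 <= 1}."""
    rng = np.random.default_rng(0)
    A = polar(rng.standard_normal((n, n)))  # start inside the ball
    for t, y in enumerate(stream):
        z = A @ y
        # per-sample gradient of ||A y||_p^p (rank-one matrix)
        g = p * (np.sign(z) * np.abs(z) ** (p - 1))[:, None] @ y[None, :]
        S = polar(g)                      # FW vertex: argmax <g, S>, ||S||_2 <= 1
        gamma = step(t)
        A = (1 - gamma) * A + gamma * S   # convex combination stays in the ball
    return A

# Toy stream: columns of Y = D X, fed one sample at a time.
rng = np.random.default_rng(2)
n, m = 8, 500
D = polar(rng.standard_normal((n, n)))
X = rng.standard_normal((n, m)) * (rng.random((n, m)) < 0.3)
Y = D @ X
A = online_fw_dictionary(Y.T, n)
```

Because each update is a convex combination of matrices with spectral norm at most one, every iterate remains in the relaxed feasible set; no projection step or storage of past samples is needed.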

Finally, we apply the above sample-efficient algorithm to blind data detection in massive MIMO systems and apply the computation-efficient algorithm to IoT sensor data compression with domain-specific designs. Extensive experiments with synthetic and real-world data demonstrate the effectiveness and efficiency of the proposed schemes.

Date of Award: 2022
Original language: English
Awarding Institution:
  • The Hong Kong University of Science and Technology
Supervisor: Vincent Kin Nang LAU
