Exploring dependencies in complex input and complex output machine learning problems

  • Elham JEBALBAREZI SARBIJAN

Student thesis: Doctoral thesis

Abstract

Multi-input and multi-output machine learning is one of the chief challenges in the era of big data (the "variety" dimension of the data). These big datasets are too large and too complex to be handled by traditional machine learning methods, and new solutions must be found. In this thesis, we investigate the effect of dependencies between multiple inputs and multiple outputs, and we show that exploiting these dependencies solves the problems more accurately and less expensively, with fewer parameters. As case studies, we choose prediction tasks in multi-label learning, where each label is equivalent to an output task, and in multimodal learning, where each modality is equivalent to an input channel. Multi-label learning is an example of an extreme classification task over an extremely large number of labels (tags). User-generated labels for any type of online data can be sparse for an individual user yet intractably large across all users. For example, in web and document categorization, image semantic analysis, protein function detection, and social network analysis, multiple outputs must be predicted simultaneously. In these problems, modelling dependencies between the output labels improves the predictions. Many existing algorithms do not adequately address multi-label classification with label dependencies and a large number of labels. In this thesis, we investigate multi-label classification with dependencies between many labels, which lets us efficiently solve multi-label learning with an intractably large number of interdependent labels, such as the automatic tagging of Wikipedia pages. We first study the nature of label dependencies and the efficiency of distributed multi-label learning methods. We then propose an assumption-free label sampling approach to handle a huge number of labels. Finally, we investigate and compare chain-ordered label-dependency methods with order-free learning methods on multi-label datasets.
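The label sampling idea above can be illustrated with a toy linear sketch: when labels are interdependent, predictors trained on a small sampled subset, plus a dependency map from that subset to the remaining labels, recover most of the full label set. All sizes and the least-squares stand-ins below are illustrative assumptions, not the thesis's actual method or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-label setup (sizes are illustrative, not from the thesis):
# n samples, d features, L labels of which only k are "core" -- the
# remaining labels are deterministic functions of the core ones, so
# the label set is highly interdependent.
n, d, L, k = 200, 10, 50, 5

X = rng.normal(size=(n, d))
core = (X @ rng.normal(size=(d, k)) > 0).astype(float)
rest = (core @ rng.normal(size=(k, L - k)) > 0).astype(float)
Y = np.hstack([core, rest])

# Label sampling: fit predictors only for a small random label subset,
# then infer every remaining label from the predicted subset through a
# least-squares dependency map (a linear stand-in for an actual
# assumption-free sampling method).
subset = rng.choice(L, size=k, replace=False)
others = np.setdiff1d(np.arange(L), subset)

X1 = np.hstack([X, np.ones((n, 1))])            # add an intercept column
B = np.linalg.lstsq(X1, Y[:, subset], rcond=None)[0]
Y_sub_hat = (X1 @ B > 0.5).astype(float)        # predict the sampled labels

S = np.hstack([Y[:, subset], np.ones((n, 1))])  # subset labels + intercept
D = np.linalg.lstsq(S, Y[:, others], rcond=None)[0]
Y_rest_hat = (np.hstack([Y_sub_hat, np.ones((n, 1))]) @ D > 0.5).astype(float)

acc = (Y_rest_hat == Y[:, others]).mean()
print(f"accuracy on the {L - k} unsampled labels: {acc:.2f}")
```

Only k = 5 of the 50 labels ever get their own predictor; the other 45 ride on the dependency map, which is the source of the parameter savings the abstract describes.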
In the second part of our investigation of dependencies, we examine the complexities of multimodal learning, since most learning tasks involve several sensory modalities, such as vision and speech, which represent our primary channels of communication and perception. We focus on how to exploit modality dependencies for multimodal fusion, integrating information from two or more modalities for better prediction. Our aim is to understand and modulate the relative contribution of each modality in multimodal inference tasks by investigating input modality dependencies. Moreover, we propose solutions to the curse of dimensionality that arises from high-order integration of data from several sources. We make several contributions to multimodal data processing. First, we investigate various basic fusion methods. In contrast to previous approaches, which use simple linear or concatenation operations, we propose to build an (M + 1)-way high-order dependency structure (a tensor) that captures the high-order relationships between M modalities and the output layer of a neural network model. Applying a modality-based tensor factorization method, which adopts different factors for different modalities, removes information in one modality that can be compensated by the other modalities with respect to the model outputs. This factorization also helps in understanding the relative utility of the information in each modality and handles the scale issues of the problem. In addition, it leads to a less complicated model with fewer parameters and can therefore act as a regularizer to avoid overfitting. According to our investigations and experimental results, including the dependencies in the prediction tasks leads to simpler models with fewer parameters while improving the prediction results.
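The contrast between the full (M + 1)-way tensor and its modality-based factorization can be sketched numerically. The sketch below, with M = 3 modalities and hypothetical embedding sizes, forms the high-order outer product, then computes the same output through per-modality low-rank factors without ever materializing the tensor; the rank and dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality embedding sizes (M = 3 modalities), output
# size, and factorization rank; real models would learn these weights.
d1, d2, d3, out_dim, rank = 8, 6, 4, 2, 4

z1, z2, z3 = rng.normal(size=d1), rng.normal(size=d2), rng.normal(size=d3)

# Full high-order fusion: a 3-way outer product of the modality
# embeddings, contracted with a dense (M + 1)-way weight tensor --
# this is where the polynomial parameter growth comes from.
T = np.einsum('i,j,k->ijk', z1, z2, z3)            # (d1, d2, d3)
W_full = rng.normal(size=(out_dim, d1, d2, d3))
y_full = np.einsum('oijk,ijk->o', W_full, T)

# Modality-based factorization: one low-rank factor per modality.
F1 = rng.normal(size=(rank, out_dim, d1))
F2 = rng.normal(size=(rank, out_dim, d2))
F3 = rng.normal(size=(rank, out_dim, d3))
y_fact = ((F1 @ z1) * (F2 @ z2) * (F3 @ z3)).sum(axis=0)

# Reassembling the factors into a full tensor confirms that the
# factorized path computes the same multilinear form.
W_rec = np.einsum('roi,roj,rok->oijk', F1, F2, F3)
y_rec = np.einsum('oijk,ijk->o', W_rec, T)

full_params = W_full.size                          # grows multiplicatively
fact_params = F1.size + F2.size + F3.size          # grows additively
print(full_params, fact_params)
```

The parameter count of the factorized form grows additively in the modality dimensions rather than multiplicatively, which is the regularizing effect the abstract attributes to modality-based factorization.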
We aim to turn the dimensionality challenge of big data into an opportunity by extracting dependencies and using them as extra information to solve prediction problems. We have shown that divide-and-conquer based on label dependencies yields a smaller yet more accurate method than methods that ignore the dependencies. We have also shown that a small subset of the labels can provide a great deal of information about the remaining labels, so this subset alone can drive the prediction tasks. We have further compared order-based dependency extraction with order-free methods, concluding that the order-free methods are more general and more accurate, especially on larger datasets. We have shown that high-order integration of the modalities captures more of the inter- and intra-modality dependencies, but suffers from polynomial growth in dimensionality. We therefore propose a fully differentiable framework based on tensor factorization that can be included in any neural learning method. In a nutshell, our results demonstrate that the dependencies between multiple inputs or outputs can make the problem simpler, smaller, and easier to train by combining the prediction tasks with dependency-based sampling, compression, or clustering methods.
Date of Award: 2020
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Pascale Ngan FUNG
