In this thesis, we focus on two problems, i.e. graph learning and clustering. On the problem of learning the graph from the observation of the signals over the nodes, we propose a new framework to effectively and efficiently solve it. Specifically, we focus on estimating the direct relationships between different nodes in an undirected graph, given the observable indirect relationships implied by a diffusion process over the graph. Based on the fact that the spectral template of the covariance matrix, i.e., the eigenvectors, coincides with that of the graph shift operator, we propose multiple formulations and the corresponding algorithms to estimate the operator: i) we introduce a Frobenius norm penalty element to learn a sparse graph structure and solve it with the majorization-minimization and block coordinate descent frameworks, achieving a significantly smaller recovery error and faster convergence speed than the benchmark methods; ii) not limited by some approximate functions for the l
0-‘norm’, we derive a new method to minimize l
0-‘norm’ directly, achieving even smaller error; and iii) considering prior knowledge of miscellaneous graph topologies in practice, we derive a new family of algorithms to learn the graph shift operator following specific topologies, such as k-sparse graphs, trees, forests, and bipartite graphs, given spectral templates. Our methods can be used to recover the graph structure as long as the spectral template is available. Extensive numerical experiments demonstrate that our methods can be used to estimate the weighted adjacency matrix and inferring meaningful connections efficiently and effectively. The results show that our proposed methods can achieve superior or equivalent performance to the benchmark methods on all the considered tasks. When the weighted adjacency matrix, i.e. the graph structure, is known, some further works can be done to retrieve more valuable information rooted in the graphs. Graph clustering is one of them, which is to allocate those connected closely vertices into the same clusters while dividing those connected weakly into different ones. In the field of graph clustering, directed graph (digraph) clustering is more challenging than the undirected one, because directed edges, unlike undirected ones, do not indicate an essential feature for finding plausible clusters, the connectivity, in a simple and obvious way. Naturally, the interactions between vertices and the clusters are based on the bi-directional paths between them. Thus, strong connectivity built by interactions can be implied by the massive bi-directional connections. Observed this point, we propose the normalized mutual cut (NMcut) for evaluating the goodness of clustering in digraphs. It is justified that the clusters of the balanced volumes with inner vertices connected strongly in double directions can be found with low NMcut. Meanwhile, because the numbers and strengths of the paths between a pair of vertices or subgraphs may not be equal reciprocally, the flows along these paths may also follow specific orientations. We name these orientations “tendencies” and try to find a powerful method that can retrieve both sensible clusters but also their pair-wise tendencies. Although satisfactory clusters can be obtained with a low NMcut, we prove that minimizing it is NP-hard. Then, to solve the problem efficiently, for its particular case on two clusters, we embed it into continuous variables and propose an algorithm with closed-form solutions. Based on this, a “top-down” hierarchical, a.k.a. “divisive”, algorithm is proposed to retrieve any number of clusters with low pair-wise connectivity. Furthermore, we derive a simultaneous algorithm to obtain clusters of low global connectivity by embedding vertices. The tendencies between clusters are shown to be indicated by the algorithms without extra costs. We prove the convergence of the algorithms and verify their effectiveness and efficiency by extensive synthetic and empirical experiments. The experimental results show that our methods can achieve superior or equivalent performance to the benchmark approaches, with applications to the fields such as the community detection.
| Date of Award | 2019 |
|---|
| Original language | English |
|---|
| Awarding Institution | - The Hong Kong University of Science and Technology
|
|---|