Abstract
Due to rising awareness of privacy protection and the sheer volume of speech data, it is becoming infeasible for Automatic Speech Recognition (ASR) system developers to train the acoustic model on the complete data, as was previously done. In this paper, we propose a novel Divide-and-Merge paradigm to address these problems. In the Divide phase, multiple acoustic models are trained on different subsets of the complete speech data. In the Merge phase, two novel algorithms are used to generate a high-quality acoustic model from the models trained on those subsets. We first propose the Genetic Merge Algorithm (GMA), a highly specialized algorithm for optimizing acoustic models that suffers from low efficiency. We then propose the SGD-Based Optimizational Merge Algorithm (SOMA), which effectively alleviates the efficiency bottleneck of GMA while maintaining superior performance. Extensive experiments on public data show that the proposed methods significantly outperform the state of the art.
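The Divide-and-Merge idea can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: toy linear regressors stand in for acoustic models, each data subset carries a hypothetical label offset to mimic heterogeneous data, and the merge step learns interpolation weights over the sub-model parameters by SGD on a held-out set. It is not the GMA or SOMA algorithm from the paper, whose details the abstract does not give.

```python
# Toy illustration only: linear regressors stand in for acoustic models,
# and the merge rule (learned interpolation weights) is an assumption,
# not the paper's GMA or SOMA.

def fit_linear(xs, ys, lr=0.05, epochs=300):
    """Fit y ~ w*x + b with per-sample SGD (stands in for training one model)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]

def merge(models, alphas):
    """Combine parameter vectors as a weighted sum with weights alphas."""
    n_params = len(models[0])
    return [sum(a * m[i] for a, m in zip(alphas, models)) for i in range(n_params)]

def mse(params, xs, ys):
    """Mean squared error of the linear model params = [w, b]."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def learn_merge_weights(models, xs, ys, lr=0.05, epochs=300):
    """Learn the merge weights by SGD on held-out data, starting from uniform."""
    alphas = [1.0 / len(models)] * len(models)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w, b = merge(models, alphas)
            err = (w * x + b) - y
            # d(prediction)/d(alpha_j) is the j-th sub-model's own prediction.
            alphas = [a - lr * err * (m[0] * x + m[1]) for a, m in zip(alphas, models)]
    return alphas

# Divide: three disjoint subsets, each with a hypothetical label offset.
grid = [i / 10 - 1 for i in range(21)]
offsets = [-0.6, 0.0, 0.9]
subsets = [grid[k::3] for k in range(3)]
models = [
    fit_linear(xs, [2 * x + 1 + off for x in xs])
    for xs, off in zip(subsets, offsets)
]

# Merge: compare a uniform average against SGD-learned merge weights.
dev_x = [i / 10 - 0.95 for i in range(20)]
dev_y = [2 * x + 1 for x in dev_x]
uniform = [1.0 / 3] * 3
learned = learn_merge_weights(models, dev_x, dev_y)
uniform_mse = mse(merge(models, uniform), dev_x, dev_y)
learned_mse = mse(merge(models, learned), dev_x, dev_y)
```

In this toy setting the learned weights recover a lower held-out error than the uniform average, because the sub-models are biased in different directions. Whether the same holds for real acoustic models depends on the architecture and data, which this sketch does not model.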
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 29th International Joint Conference on Artificial Intelligence, IJCAI 2020 |
| Editors | Christian Bessiere |
| Publisher | International Joint Conferences on Artificial Intelligence |
| Pages | 3709-3715 |
| Number of pages | 7 |
| ISBN (Electronic) | 9780999241165 |
| Publication status | Published - 2020 |
| Externally published | Yes |
| Event | 29th International Joint Conference on Artificial Intelligence, IJCAI 2020 - Yokohama, Japan. Duration: 1 Jan 2021 → … |
Publication series
| Name | IJCAI International Joint Conference on Artificial Intelligence |
|---|---|
| Volume | 2021-January |
| ISSN (Print) | 1045-0823 |
Conference
| Conference | 29th International Joint Conference on Artificial Intelligence, IJCAI 2020 |
|---|---|
| Country/Territory | Japan |
| City | Yokohama |
| Period | 1/01/21 → … |
Bibliographical note
Publisher Copyright: © 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved.
Fingerprint
Dive into the research topics of 'A de novo divide-and-merge paradigm for acoustic model optimization in automatic speech recognition'. Together they form a unique fingerprint.