A de novo divide-and-merge paradigm for acoustic model optimization in automatic speech recognition

Conghui Tan*, Di Jiang, Jinhua Peng, Xueyang Wu, Qian Xu, Qiang Yang

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/Report; Conference Paper published in a book; peer-review

Abstract

Due to the rising awareness of privacy protection and the voluminous scale of speech data, it is becoming infeasible for Automatic Speech Recognition (ASR) system developers to train the acoustic model with complete data as before. In this paper, we propose a novel Divide-and-Merge paradigm to solve salient problems plaguing the ASR field. In the Divide phase, multiple acoustic models are trained based upon different subsets of the complete speech data, while in the Merge phase two novel algorithms are utilized to generate a high-quality acoustic model based upon those trained on data subsets. We first propose the Genetic Merge Algorithm (GMA), which is a highly specialized algorithm for optimizing acoustic models but suffers from low efficiency. We further propose the SGD-Based Optimizational Merge Algorithm (SOMA), which effectively alleviates the efficiency bottleneck of GMA and maintains superior performance. Extensive experiments on public data show that the proposed methods can significantly outperform the state-of-the-art.
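The divide-then-merge idea in the abstract can be illustrated with a toy sketch. It does not reproduce GMA or SOMA, whose details are not given here. It assumes a one-dimensional linear model in place of an acoustic model, synthetic data in place of speech, and a merge step that learns one weight per sub-model by full-batch gradient descent on a small held-out set. All names (`make_data`, `train_subset`, `merge`, `learn_merge_weights`) are hypothetical.

```python
import random

rng = random.Random(0)

def make_data(n):
    # Synthetic stand-in for speech data: noisy samples of y = 2x + 1.
    return [(x, 2 * x + 1 + rng.gauss(0, 0.3))
            for x in (rng.uniform(-1, 1) for _ in range(n))]

def train_subset(data, epochs=200, lr=0.05):
    # Divide phase: fit y ~ w*x + b by per-sample SGD on one subset only.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]

def mse(params, data):
    w, b = params
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

def merge(models, alphas):
    # Merge phase: weighted sum of the sub-models' parameter vectors.
    return [sum(a * m[i] for a, m in zip(alphas, models))
            for i in range(len(models[0]))]

def learn_merge_weights(models, held_out, steps=2000, lr=0.01):
    # Assumption: only the merge weights are trained; sub-models stay frozen.
    k = len(models)
    alphas = [1.0 / k] * k  # start from a uniform average
    for _ in range(steps):
        w, b = merge(models, alphas)
        grads = [0.0] * k
        for x, y in held_out:
            err = (w * x + b) - y
            for j, (wj, bj) in enumerate(models):
                # d(loss)/d(alpha_j) = 2 * mean(err * prediction of model j)
                grads[j] += 2 * err * (wj * x + bj) / len(held_out)
        alphas = [a - lr * g for a, g in zip(alphas, grads)]
    return alphas

K = 3
subsets = [make_data(20) for _ in range(K)]   # each model sees one subset only
held_out = make_data(30)                      # small set used for merging
models = [train_subset(s) for s in subsets]

uniform = merge(models, [1.0 / K] * K)
alphas = learn_merge_weights(models, held_out)
learned = merge(models, alphas)

print("uniform-average loss:", mse(uniform, held_out))
print("learned-merge loss:  ", mse(learned, held_out))
```

In this sketch no step ever pools the subsets: each sub-model is trained on its own subset, and the merge step sees only the trained parameters plus the held-out set. Whether the paper's algorithms merge at the parameter level in this way is an assumption of the example.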

Original language: English
Title of host publication: Proceedings of the 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
Editors: Christian Bessiere
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 3709-3715
Number of pages: 7
ISBN (Electronic): 9780999241165
Publication status: Published - 2020
Externally published: Yes
Event: 29th International Joint Conference on Artificial Intelligence, IJCAI 2020 - Yokohama, Japan
Duration: 1 Jan 2021 → …

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2021-January
ISSN (Print): 1045-0823

Conference

Conference: 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
Country/Territory: Japan
City: Yokohama
Period: 1/01/21 → …

Bibliographical note

Publisher Copyright:
© 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved.
