Acoustic and phonetic confusions in accented speech recognition

Yi Liu*, Pascale Fung

*Corresponding author for this work

Research output: Contribution to conference › Conference Paper › peer-review

Abstract

Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone lies acoustically between two baseform models and can be recognized equally well as either one. We propose that these confusions must be analyzed and modeled separately in order to improve accented speech recognition without degrading standard speech recognition. We propose using a likelihood ratio test to measure phonetic confusion, and an asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are added to an augmented pronunciation dictionary, while phonetic models with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, yielding a significant 5.7% absolute WER reduction without degrading standard speech recognition.
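The two measures named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the scalar log-likelihoods (a real system would score full HMM/GMM phone models on accented speech segments), the decision threshold, and the use of 1-D Gaussian KL divergence as the asymmetric acoustic distance are all illustrative assumptions.

```python
import math

def log_likelihood_ratio(ll_surface, ll_baseform):
    """Log-likelihood ratio of an accent-specific surface-form phone model
    versus the canonical baseform model, scored on the same speech segment.
    A large positive value suggests a systematic phonetic substitution."""
    return ll_surface - ll_baseform

def is_phonetic_confusion(ll_surface, ll_baseform, threshold=2.0):
    """Flag phonetic confusion when the surface form explains the data
    substantially better than the baseform. The threshold is a
    hypothetical tuning parameter, not a value from the paper."""
    return log_likelihood_ratio(ll_surface, ll_baseform) > threshold

def asymmetric_gaussian_distance(mu_p, var_p, mu_q, var_q):
    """KL divergence KL(p || q) between two 1-D Gaussian phone models.
    It is asymmetric: d(p, q) != d(q, p) in general, so it can capture
    directional acoustic proximity between a baseform and a surface form."""
    return (0.5 * math.log(var_q / var_p)
            + (var_p + (mu_p - mu_q) ** 2) / (2.0 * var_q)
            - 0.5)
```

Under this sketch, phone pairs flagged by the ratio test but with a large acoustic distance would be candidates for new dictionary entries, while pairs with small distance in both directions would be candidates for model merging.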

Original language: English
Pages: 3033-3036
Number of pages: 4
Publication status: Published - 2005
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: 4 Sept 2005 – 8 Sept 2005

Conference

Conference: 9th European Conference on Speech Communication and Technology
Country/Territory: Portugal
City: Lisbon
Period: 4/09/05 – 8/09/05
