Modeling pronunciation variation using context-dependent weighting and B/S refined acoustic modeling

Fang Zheng, Zhanjiang Song, Pascale Fung, William Byrne

Research output: Chapter in Book/Conference Proceeding/Report > Conference Paper published in a book > peer-review

4 Citations (Scopus)

Abstract

Pronunciation variability is an important issue that must be addressed when developing practical automatic spontaneous speech recognition systems. By studying the initial/final (IF) characteristics of the Chinese language and developing the Bayesian equation, we propose the concepts of the generalized initial/final (GIF) and the generalized syllable (GS), the GIF modeling method and the IF-GIF modeling method, as well as a context-dependent pronunciation weighting method. Using these approaches, IF-GIF modeling reduces the Chinese syllable error rate (SER) by 6.3% and 4.2% compared with GIF modeling and IF modeling, respectively, when no language model, such as a syllable or word N-gram, is used.

Original language: English
Title of host publication: EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology
Editors: Borge Lindberg, Henrik Benner, Paul Dalsgaard, Zheng-Hua Tan
Publisher: International Speech Communication Association
Pages: 57-60
Number of pages: 4
ISBN (Electronic): 8790834100, 9788790834104
Publication status: Published - 2001
Event: 7th European Conference on Speech Communication and Technology - Scandinavia, EUROSPEECH 2001 - Aalborg, Denmark
Duration: 3 Sept 2001 to 7 Sept 2001

Publication series

Name: EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology

Conference

Conference: 7th European Conference on Speech Communication and Technology - Scandinavia, EUROSPEECH 2001
Country/Territory: Denmark
City: Aalborg
Period: 3/09/01 to 7/09/01
