DECISION TREE-BASED TRIPHONES ARE ROBUST AND PRACTICAL FOR MANDARIAN SPEECH RECOGNITION

Yi Liu, Pascale Fung

Research output: Contribution to conference › Conference Paper › peer-review

Abstract

In large-vocabulary, speaker-independent speech recognition systems, vocabulary words must be modeled by subword units. This paper studies triphone units for Mandarin speech recognition and compares them with biphone and context-independent phonetic units. To handle triphones that are unseen in the training data, decision-tree-based clustering is applied to the triphone units. This method achieves high recognition performance with limited training data and also reduces model training time. The robustness and effectiveness of the cross-word, tree-based triphone units are demonstrated on a speaker-independent continuous Mandarin speech recognition task. After state tying for the triphone models, training computation time is reduced by a factor of about 2.3, and syllable recognition accuracy increases by 28.7% compared to monophone units and by 13.5% compared to biphone units.
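
As a rough illustration of why tree-based clustering covers unseen triphones, the sketch below walks a tiny phonetic decision tree for one base phone. Everything in it is an assumption made for illustration and is not taken from the paper: the phone sets, the questions, the tied-state names, and the helper `tied_state` are all hypothetical. It also uses a single tree per base phone, whereas practical systems typically grow one tree per HMM state.

```python
# Hypothetical phone classes used as context questions (not from the paper).
NASALS = {"m", "n", "ng"}
STOPS = {"b", "p", "d", "t", "g", "k"}

# Hypothetical tree for the base phone "a".
# Internal nodes ask a yes/no question about the left or right context;
# each leaf names a tied state shared by every triphone that reaches it.
TREE_FOR_A = {
    "question": ("left", NASALS),
    "yes": {"leaf": "a_tied_1"},
    "no": {
        "question": ("right", STOPS),
        "yes": {"leaf": "a_tied_2"},
        "no": {"leaf": "a_tied_3"},
    },
}


def tied_state(tree, left, right):
    """Return the tied-state name for the triphone left-a+right."""
    node = tree
    while "leaf" not in node:
        side, phone_set = node["question"]
        context = left if side == "left" else right
        node = node["yes"] if context in phone_set else node["no"]
    return node["leaf"]


# A triphone never observed in training still reaches some leaf,
# because the questions depend only on the classes of its context phones.
print(tied_state(TREE_FOR_A, "s", "k"))
```

Because every possible context answers each question with yes or no, any triphone, seen or unseen, is assigned to an existing tied state, and triphones that share a leaf pool their training data.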

Original language: English
Pages: 895-898
Number of pages: 4
Publication status: Published - 1999
Event: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary
Duration: 5 Sept 1999 to 9 Sept 1999

Conference

Conference: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999
Country/Territory: Hungary
City: Budapest
Period: 5/09/99 to 9/09/99

Bibliographical note

Publisher Copyright:
© 1999 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999. All rights reserved.
