TY - GEN
T1 - Boosting with anti-models for automatic language identification
AU - Yang, Xi
AU - Siu, Man Hung
AU - Gish, Herbert
AU - Mak, Brian
PY - 2007
Y1 - 2007
N2 - In this paper, we adopt the boosting framework to improve the performance of acoustic-based Gaussian mixture model (GMM) Language Identification (LID) systems. We introduce a set of low-complexity, boosted target and anti-models that are estimated from training data to improve class separation, and these models are integrated during the LID backend process. This results in a fast estimation process. Experiments were performed on the 12-language, NIST 2003 language recognition evaluation classification task using a GMM-acoustic-score-only LID system, as well as the one that combines GMM acoustic scores with sequence language model scores from GMM tokenization. Classification errors were reduced from 18.8% to 10.5% on the acoustic-score-only system, and from 11.3% to 7.8% on the combined acoustic and tokenization system.
AB - In this paper, we adopt the boosting framework to improve the performance of acoustic-based Gaussian mixture model (GMM) Language Identification (LID) systems. We introduce a set of low-complexity, boosted target and anti-models that are estimated from training data to improve class separation, and these models are integrated during the LID backend process. This results in a fast estimation process. Experiments were performed on the 12-language, NIST 2003 language recognition evaluation classification task using a GMM-acoustic-score-only LID system, as well as the one that combines GMM acoustic scores with sequence language model scores from GMM tokenization. Classification errors were reduced from 18.8% to 10.5% on the acoustic-score-only system, and from 11.3% to 7.8% on the combined acoustic and tokenization system.
KW - Boosting
KW - Discriminative training
KW - Language identification
UR - https://www.scopus.com/pages/publications/56149124996
M3 - Conference Paper published in a book
AN - SCOPUS:56149124996
SN - 9781605603162
T3 - International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
SP - 1537
EP - 1540
BT - International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
T2 - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Y2 - 27 August 2007 through 31 August 2007
ER -