
Evaluating the Word Sense Disambiguation Performance of Statistical Machine Translation

Research output: Contribution to conference › Conference Paper › peer-review

Abstract

We present the first known empirical test of an increasingly common speculative claim, by evaluating a representative Chinese-to-English SMT model directly on word sense disambiguation performance, using standard WSD evaluation methodology and datasets from the Senseval-3 Chinese lexical sample task. Much effort has been put into designing and evaluating dedicated word sense disambiguation (WSD) models, in particular with the Senseval series of workshops. At the same time, the recent improvements in the BLEU scores of statistical machine translation (SMT) suggest that SMT models are good at predicting the right translation of the words in source language sentences. Surprisingly, however, the WSD accuracy of SMT models has never been evaluated and compared with that of the dedicated WSD models. We present controlled experiments showing the WSD accuracy of current typical SMT models to be significantly lower than that of all the dedicated WSD models considered. This tends to support the view that despite recent speculative claims to the contrary, current SMT models do have limitations in comparison with dedicated WSD models, and that SMT should benefit from the better predictions made by the WSD models.
Original language: English
Pages: 120-125
Number of pages: 6
Publication status: Published - 2005
Event: 2nd International Joint Conference on Natural Language Processing Companion, IJCNLP 2005 - Jeju Island, Korea, Republic of
Duration: 11 Oct 2005 to 13 Oct 2005

Conference

Conference: 2nd International Joint Conference on Natural Language Processing Companion, IJCNLP 2005
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 11/10/05 to 13/10/05

Bibliographical note

Publisher Copyright:
© 2005 Asian Federation of Natural Language Processing.
