TY - GEN
T1 - Question classification by approximating semantics
AU - Feng, Guangyu
AU - Xiong, Kun
AU - Tang, Yang
AU - Cui, Anqi
AU - Bai, Jing
AU - Li, Hang
AU - Yang, Qiang
AU - Li, Ming
PY - 2015/5/18
Y1 - 2015/5/18
N2 - A central task of computational linguistics is to decide if two pieces of texts have similar meanings. Ideally, this depends on an intuitive notion of semantic distance. While this semantic distance is most likely undefinable and uncomputable, in practice it is approximated heuristically, consciously or unconsciously. In this paper, we present a theory, and its implementation, to approximate the elusive semantic distance by the well-defined information distance. It is mathematically proven that any computable approximation of the intuitive concept of semantic distance is "covered" by our theory. We have implemented our theory to question answering (QA) and performed experiments based on data extracted from over 35 million question-answer pairs. Experiments demonstrate that our initial implementation of the theory produces convincingly fewer errors in classification compared to other academic models and commercial systems.
AB - A central task of computational linguistics is to decide if two pieces of texts have similar meanings. Ideally, this depends on an intuitive notion of semantic distance. While this semantic distance is most likely undefinable and uncomputable, in practice it is approximated heuristically, consciously or unconsciously. In this paper, we present a theory, and its implementation, to approximate the elusive semantic distance by the well-defined information distance. It is mathematically proven that any computable approximation of the intuitive concept of semantic distance is "covered" by our theory. We have implemented our theory to question answering (QA) and performed experiments based on data extracted from over 35 million question-answer pairs. Experiments demonstrate that our initial implementation of the theory produces convincingly fewer errors in classification compared to other academic models and commercial systems.
KW - Information distance
KW - Question answering
KW - Semantic distance
KW - Text classification
UR - https://www.scopus.com/pages/publications/84968542274
U2 - 10.1145/2740908.2745403
DO - 10.1145/2740908.2745403
M3 - Conference Paper published in a book
AN - SCOPUS:84968542274
T3 - WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web
SP - 407
EP - 417
BT - WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web
PB - Association for Computing Machinery, Inc
T2 - 24th International Conference on World Wide Web, WWW 2015
Y2 - 18 May 2015 through 22 May 2015
ER -