TY - GEN
T1 - Focused named entity recognition using machine learning
AU - Zhang, Li
AU - Pan, Yue
AU - Zhang, Tong
PY - 2004
Y1 - 2004
N2 - In this paper we study the problem of finding the most topical named entities among all entities in a document, which we refer to as focused named entity recognition. We show that these focused named entities are useful for many natural language processing applications, such as document summarization, search result ranking, and entity detection and tracking. We propose a statistical model for focused named entity recognition by converting it into a classification problem. We then study the impact of various linguistic features and compare a number of classification algorithms. From experiments on an annotated Chinese news corpus, we demonstrate that the proposed method can achieve near human-level accuracy.
KW - Decision tree
KW - Information retrieval
KW - Naive Bayes
KW - Robust risk minimization
KW - Text summarization
KW - Topic identification
UR - https://www.scopus.com/pages/publications/8644241114
U2 - 10.1145/1008992.1009042
DO - 10.1145/1008992.1009042
M3 - Conference Paper published in a book
AN - SCOPUS:8644241114
SN - 1581138814
SN - 9781581138818
T3 - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 281
EP - 288
BT - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery (ACM)
T2 - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Y2 - 25 July 2004 through 29 July 2004
ER -