TY - GEN
T1 - Diverse topic phrase extraction through latent semantic analysis
AU - Chen, Jilin
AU - Yan, Jun
AU - Zhang, Benyu
AU - Yang, Qiang
AU - Chen, Zheng
PY - 2006
Y1 - 2006
N2 - We propose a novel algorithm for extracting diverse topic phrases in order to summarize large corpora. Previous work often ignores the importance of diversity and thus extracts phrases that cluster around a few popular topics while failing to cover other, less obvious but still important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our algorithm improves both the relevance of the extracted topic phrases and their diversity across topics.
AB - We propose a novel algorithm for extracting diverse topic phrases in order to summarize large corpora. Previous work often ignores the importance of diversity and thus extracts phrases that cluster around a few popular topics while failing to cover other, less obvious but still important topics. We solve this problem through document re-weighting and phrase diversification using latent semantic analysis (LSA). Experiments on various datasets show that our algorithm improves both the relevance of the extracted topic phrases and their diversity across topics.
UR - https://www.scopus.com/pages/publications/84878029280
U2 - 10.1109/ICDM.2006.61
DO - 10.1109/ICDM.2006.61
M3 - Conference Paper published in a book
AN - SCOPUS:84878029280
SN - 0769527019
SN - 9780769527017
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 834
EP - 838
BT - Proceedings - Sixth International Conference on Data Mining, ICDM 2006
T2 - 6th International Conference on Data Mining, ICDM 2006
Y2 - 18 December 2006 through 22 December 2006
ER -