Abstract
Document classification plays an important role in natural language processing. Within this area, keyword extraction shows great potential for summarizing an entire document. Attention is the process of selectively concentrating on a discrete aspect of information while ignoring other perceivable information. Inspired by the visual attention mechanism, a new probabilistic keyword extraction algorithm is proposed. An unsupervised, neural-network-based pre-training method is proposed for training the semantic attention-based keyword extraction algorithm, which helps extract keywords with rich contextual information from the document. A bidirectional long short-term memory (LSTM) network combined with the proposed semantic keyword extraction algorithm is designed for both topic and sentiment classification tasks. Experiments on four large-scale datasets show that the proposed visual attention-based keyword extraction algorithm outperforms the baseline methods. The semantic attention-based keyword extraction method is effective at summarizing the content of a document, which is very useful for large-scale document classification.
| Original language | English |
|---|---|
| Pages (from-to) | 25355-25367 |
| Number of pages | 13 |
| Journal | Multimedia Tools and Applications |
| Volume | 77 |
| Issue number | 19 |
| Publication status | Published - 1 Oct 2018 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
Keywords
- Document classification
- Keyword extraction
- Long short-term memory
- Semantic context
- Visual attention
Fingerprint
Dive into the research topics of 'A visual attention-based keyword extraction for document classification'. Together they form a unique fingerprint.