A high-performance semi-supervised learning method for text chunking

Rie Kubota Ando*, Tong Zhang

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/Report > Conference Paper published in a book > peer-review

188 Citations (Scopus)

Abstract

In machine learning, whether one can build a more accurate classifier by using unlabeled data (semi-supervised learning) is an important issue. Although a number of semi-supervised methods have been proposed, their effectiveness on NLP tasks is not always clear. This paper presents a novel semi-supervised method that employs a learning paradigm which we call structural learning. The idea is to find "what good classifiers are like" by learning from thousands of automatically generated auxiliary classification problems on unlabeled data. By doing so, the common predictive structure shared by the multiple classification problems can be discovered, which can then be used to improve performance on the target problem. The method produces performance higher than the previous best results on CoNLL'00 syntactic chunking and CoNLL'03 named entity chunking (English and German).
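The abstract describes the idea only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes linear auxiliary predictors fit by ridge regression and takes the shared structure to be the top singular directions of their stacked weight vectors; both choices, along with the names `learn_shared_structure`, `augment_features`, `h`, and `ridge`, are assumptions introduced here. The auxiliary labels are assumed to have been derived automatically from the unlabeled data by some means not shown.

```python
import numpy as np


def learn_shared_structure(X_unlabeled, aux_labels, h, ridge=1.0):
    """Estimate a low-dimensional structure shared by many auxiliary problems.

    X_unlabeled: (n, d) feature matrix built from unlabeled data.
    aux_labels:  (n, m) matrix, one column of +1/-1 labels per auxiliary problem
                 (assumed to be generated automatically).
    h:           number of shared dimensions to keep (h <= min(d, m)).
    Returns theta with shape (h, d) and orthonormal rows.
    """
    n, d = X_unlabeled.shape
    # Fit all m auxiliary linear predictors at once with ridge regression.
    # (Ridge regression is an assumption of this sketch.)
    gram = X_unlabeled.T @ X_unlabeled + ridge * np.eye(d)
    W = np.linalg.solve(gram, X_unlabeled.T @ aux_labels)  # shape (d, m)
    # The leading left singular vectors of W span the directions in feature
    # space that the auxiliary predictors have in common.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T


def augment_features(X, theta):
    """Append the shared-structure projection to the original features."""
    return np.hstack([X, X @ theta.T])
```

A classifier for the target problem would then be trained on `augment_features(X_labeled, theta)` instead of the raw labeled features, so that what was learned from unlabeled data enters the target task as extra input dimensions.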

Original language: English
Title of host publication: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-9
Number of pages: 9
ISBN (Print): 1932432515, 9781932432510
Publication status: Published - 2005
Externally published: Yes
Event: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05 - Ann Arbor, MI, United States
Duration: 25 Jun 2005 – 30 Jun 2005

Publication series

Name: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05
Country/Territory: United States
City: Ann Arbor, MI
Period: 25/06/05 – 30/06/05
