Active learning using adaptive resampling

Vijay S. Iyengar*, Chidanand Apte, Tong Zhang

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/Report › Conference Paper published in a book › peer-review

73 Citations (Scopus)

Abstract

Classification modeling (a.k.a. supervised learning) is an extremely useful analytical technique for developing predictive and forecasting applications. The explosive growth in data warehousing and internet usage has made large amounts of data potentially available for developing classification models. For example, natural language text is widely available in many forms (e.g., electronic mail, news articles, reports, and web page contents). Categorization of data is a common activity which can be automated to a large extent using supervised learning methods. Examples of this include routing of electronic mail, satellite image classification, and character recognition. However, these tasks require labeled data sets of sufficiently high quality with adequate instances for training the predictive models. Much of the on-line data, particularly the unstructured variety (e.g., text), is unlabeled. Labeling is usually an expensive manual process done by domain experts. Active learning is an approach to solving this problem: it identifies a subset of the data that needs to be labeled and uses this subset to generate classification models. We present an active learning method that uses adaptive resampling in a natural way to significantly reduce the size of the required labeled set and generates a classification model that achieves the high accuracies possible with current adaptive resampling methods.
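The loop the abstract describes (train on a small labeled set, pick the unlabeled points worth labeling, repeat) can be illustrated with a minimal sketch. This is not the authors' algorithm: the 1-D threshold classifiers, the re-weighting rule, the committee-vote selection criterion, and every function name below are assumptions chosen only to keep the example short and self-contained.

```python
import random

def predict(stump, x):
    """A stump is (threshold, sign); sign=1 predicts 1 when x >= threshold."""
    t, sign = stump
    return int(x >= t) if sign == 1 else int(x < t)

def train_stump(points):
    """Fit a 1-D threshold classifier by exhaustive search over midpoints."""
    xs = sorted({x for x, _ in points})
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best = None
    for t in candidates:
        for sign in (1, -1):
            errors = sum(1 for x, y in points if predict((t, sign), x) != y)
            if best is None or errors < best[0]:
                best = (errors, (t, sign))
    return best[1]

def adaptive_committee(labeled, rounds, rng):
    """Build a committee by resampling the labeled set, drawing more often
    the points that the committee so far misclassifies.
    The weighting rule is an illustrative assumption, not the paper's."""
    weights = [1.0] * len(labeled)
    committee = []
    for _ in range(rounds):
        sample = rng.choices(labeled, weights=weights, k=len(labeled))
        committee.append(train_stump(sample))
        weights = [1.0 + sum(predict(s, x) != y for s in committee) for x, y in labeled]
    return committee

def vote_fraction(committee, x):
    """Fraction of committee members that predict class 1 for x."""
    return sum(predict(s, x) for s in committee) / len(committee)

def select_queries(committee, pool, batch):
    """Pick the unlabeled points whose committee vote is closest to an even split."""
    return sorted(pool, key=lambda x: abs(vote_fraction(committee, x) - 0.5))[:batch]

def active_learn(pool, oracle, seed_points, iterations=5, batch=2, rounds=11, seed=0):
    """Alternate between committee training and labeling the most contested points.
    `oracle` stands in for the human expert who supplies labels."""
    rng = random.Random(seed)
    labeled = [(x, oracle(x)) for x in seed_points]
    pool = [x for x in pool if x not in seed_points]
    for _ in range(iterations):
        committee = adaptive_committee(labeled, rounds, rng)
        for x in select_queries(committee, pool, batch):
            labeled.append((x, oracle(x)))
            pool.remove(x)
    return adaptive_committee(labeled, rounds, rng), labeled
```

With a pool of 100 points and two seed labels, `active_learn(pool, oracle, [0.0, 0.99])` requests ten further labels under the default settings, so the final model is trained on 12 labeled points rather than the full pool. How close its accuracy comes to a model trained on all labels depends on the data and on the selection rule, which this sketch does not attempt to tune.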

Original language: English
Title of host publication: Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Editors: R. Ramakrishnan, S. Stolfo, R. Bayardo, I. Parsa
Publisher: Association for Computing Machinery (ACM)
Pages: 91-98
Number of pages: 8
ISBN (Print): 1581132336, 9781581132335
Publication status: Published - 2000
Externally published: Yes
Event: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000) - Boston, MA, United States
Duration: 20 Aug 2000 → 23 Aug 2000

Publication series

Name: Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000)
Country/Territory: United States
City: Boston, MA
Period: 20/08/00 → 23/08/00

Keywords

  • Active learning
  • Adaptive resampling
  • Classification
  • Data mining
  • Machine learning
