TY - GEN
T1 - Classification via minimum incremental coding length (MICL)
AU - Wright, John
AU - Ma, Yi
AU - Tao, Yangyu
AU - Lin, Zhouchen
AU - Shum, Heung Yeung
PY - 2008
Y1 - 2008
N2 - We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
AB - We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
UR - http://www.scopus.com/inward/record.url?scp=85161960953&partnerID=8YFLogxK
M3 - Conference Paper published in a book
AN - SCOPUS:85161960953
SN - 160560352X
SN - 9781605603520
T3 - Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
BT - Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
PB - Neural Information Processing Systems
T2 - 21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Y2 - 3 December 2007 through 6 December 2007
ER -