TY - GEN
T1 - Distance metric learning under covariate shift
AU - Cao, Bin
AU - Ni, Xiaochuan
AU - Sun, Jian Tao
AU - Wang, Gang
AU - Yang, Qiang
PY - 2011
Y1 - 2011
N2 - Learning distance metrics is a fundamental problem in machine learning. Previous distance-metric learning research assumes that the training and test data are drawn from the same distribution, which may be violated in practical applications. When the distributions differ, a situation referred to as covariate shift, the metric learned from training data may not work well on the test data. In this case the metric is said to be inconsistent. In this paper, we address this problem by proposing a novel metric learning framework known as consistent distance metric learning (CDML), which solves the problem under covariate shift situations. We theoretically analyze the conditions when the metrics learned under covariate shift are consistent. Based on the analysis, a convex optimization problem is proposed to deal with the CDML problem. An importance sampling method is proposed for metric learning and two importance weighting strategies are proposed and compared in this work. Experiments are carried out on synthetic and real world datasets to show the effectiveness of the proposed method.
AB - Learning distance metrics is a fundamental problem in machine learning. Previous distance-metric learning research assumes that the training and test data are drawn from the same distribution, which may be violated in practical applications. When the distributions differ, a situation referred to as covariate shift, the metric learned from training data may not work well on the test data. In this case the metric is said to be inconsistent. In this paper, we address this problem by proposing a novel metric learning framework known as consistent distance metric learning (CDML), which solves the problem under covariate shift situations. We theoretically analyze the conditions when the metrics learned under covariate shift are consistent. Based on the analysis, a convex optimization problem is proposed to deal with the CDML problem. An importance sampling method is proposed for metric learning and two importance weighting strategies are proposed and compared in this work. Experiments are carried out on synthetic and real world datasets to show the effectiveness of the proposed method.
UR - https://www.scopus.com/pages/publications/84867732087
U2 - 10.5591/978-1-57735-516-8/IJCAI11-205
DO - 10.5591/978-1-57735-516-8/IJCAI11-205
M3 - Conference Paper published in a book
AN - SCOPUS:84867732087
SN - 9781577355120
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 1204
EP - 1210
BT - IJCAI 2011 - 22nd International Joint Conference on Artificial Intelligence
T2 - 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011
Y2 - 16 July 2011 through 22 July 2011
ER -