Information-theoretic metric learning

20 June 2007

proceedings article
Published by Association for Computing Machinery (ACM)

p. 209-216
https://doi.org/10.1145/1273496.1273523

Abstract

In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem: that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it is fast and scalable. Unlike most existing methods, it requires no eigenvalue computations or semi-definite programming. We also present an online version and derive regret bounds for the resulting algorithm. Finally, we evaluate our method on a recent error reporting system for software called Clarify, in the context of metric learning for nearest neighbor classification, as well as on standard data sets.
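
The quantities named in the abstract can be made concrete with a short NumPy sketch. It is illustrative only and not the authors' implementation: the function names are hypothetical, the formulas are the standard definitions of the squared Mahalanobis distance and the LogDet divergence, and `project_onto_constraint` handles one equality constraint with no slack variables and no loop over constraints, which are simplifying assumptions.

```python
import numpy as np


def mahalanobis_sq(A, x, y):
    """Squared Mahalanobis distance (x - y)^T A (x - y); linear in A."""
    v = x - y
    return float(v @ A @ v)


def logdet_divergence(A, A0):
    """LogDet divergence tr(A A0^-1) - log det(A A0^-1) - n.

    Assumes A and A0 are symmetric positive definite n x n matrices.
    """
    n = A.shape[0]
    M = A @ np.linalg.inv(A0)
    _, logdet = np.linalg.slogdet(M)
    return float(np.trace(M) - logdet - n)


def project_onto_constraint(A, x, y, target):
    """Move A so that the squared distance between x and y equals `target`.

    Simplified assumption: a single equality constraint, target > 0.
    The update is rank one, so it needs no eigenvalue computation.
    """
    v = x - y
    p = float(v @ A @ v)          # current squared distance, must be > 0
    beta = (target - p) / p**2    # chosen so the new distance is `target`
    Av = A @ v                    # A is symmetric, so A v v^T A = (Av)(Av)^T
    return A + beta * np.outer(Av, Av)


# Hypothetical usage: start from the Euclidean metric (A0 = I) and
# pull two points, treated here as "similar", closer together.
A0 = np.eye(2)
x = np.array([1.0, 0.0])
y = np.array([0.0, 0.0])
A1 = project_onto_constraint(A0, x, y, target=0.25)
```

Because the squared distance is linear in `A`, a bound on the distance between two points is a linear constraint on `A`, which is the sense in which the abstract speaks of linear constraints. `logdet_divergence(A1, A0)` then measures how far the updated matrix has moved from the prior `A0`.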

This publication has 8 references indexed in Scilit:

Improved error reporting for software that uses black-box components
Published by Association for Computing Machinery (ACM), 2007
Metric learning for text documents
Published by Institute of Electrical and Electronics Engineers (IEEE), 2006
Learning low-rank kernel matrices
Published by Association for Computing Machinery (ACM), 2006
Learning a Similarity Metric Discriminatively, with Application to Face Verification
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
A probabilistic framework for semi-supervised clustering
Published by Association for Computing Machinery (ACM), 2004
Online and batch learning of pseudo-metrics
Published by Association for Computing Machinery (ACM), 2004
Exponentiated Gradient versus Gradient Descent for Linear Predictors
Information and Computation, 1997
Discriminant adaptive nearest neighbor classification
Published by Institute of Electrical and Electronics Engineers (IEEE), 1996