Information-theoretic metric learning
- 20 June 2007
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 209-216
- https://doi.org/10.1145/1273496.1273523
Abstract
In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem: that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it is fast and scalable. Unlike most existing methods, no eigenvalue computations or semi-definite programming are required. We also present an online version and derive regret bounds for the resulting algorithm. Finally, we evaluate our method on a recent error reporting system for software called Clarify, in the context of metric learning for nearest neighbor classification, as well as on standard data sets.
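The quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single equality constraint on one pair of points, omits slack variables and inequality constraints, and uses hypothetical function names (`mahalanobis_sq`, `logdet_divergence`, `project_onto_distance`). It shows the squared Mahalanobis distance under a positive definite matrix `A`, the LogDet divergence between `A` and a prior `A0`, and a rank-one update that makes one pair's distance equal a target value without any eigenvalue computation.

```python
import numpy as np


def mahalanobis_sq(A, x, y):
    """Squared Mahalanobis distance (x - y)^T A (x - y)."""
    v = x - y
    return float(v @ A @ v)


def logdet_divergence(A, A0):
    """LogDet divergence tr(A A0^-1) - log det(A A0^-1) - n."""
    n = A.shape[0]
    # A0^-1 A has the same trace and determinant as A A0^-1.
    M = np.linalg.solve(A0, A)
    sign, logabsdet = np.linalg.slogdet(M)
    if sign <= 0:
        raise ValueError("both matrices must be positive definite")
    return float(np.trace(M) - logabsdet - n)


def project_onto_distance(A, x, y, target):
    """Rank-one update of symmetric A so that the pair (x, y) has squared
    distance exactly `target`. Assumption: one equality constraint, no slack."""
    v = x - y
    p = float(v @ A @ v)
    if p <= 0 or target <= 0:
        raise ValueError("current and target distances must be positive")
    beta = (target - p) / p**2
    Av = A @ v
    # A + beta * A v v^T A; A is symmetric, so A v v^T A = outer(Av, Av).
    return A + beta * np.outer(Av, Av)


# Illustrative data (made up for this sketch): identity prior, one pair of points.
A0 = np.eye(2)
x = np.array([2.0, 0.0])
y = np.array([0.0, 0.0])

A1 = project_onto_distance(A0, x, y, target=1.0)
d_before = mahalanobis_sq(A0, x, y)
d_after = mahalanobis_sq(A1, x, y)
div = logdet_divergence(A1, A0)
```

In this toy case the update only rescales the direction of `x - y`, leaving the orthogonal direction untouched, which is the sense in which the learned matrix stays close to the prior.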
This publication has 8 references indexed in Scilit:
- Improved error reporting for software that uses black-box components. Published by Association for Computing Machinery (ACM), 2007
- Metric learning for text documents. Published by Institute of Electrical and Electronics Engineers (IEEE), 2006
- Learning low-rank kernel matrices. Published by Association for Computing Machinery (ACM), 2006
- Learning a Similarity Metric Discriminatively, with Application to Face Verification. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- A probabilistic framework for semi-supervised clustering. Published by Association for Computing Machinery (ACM), 2004
- Online and batch learning of pseudo-metrics. Published by Association for Computing Machinery (ACM), 2004
- Exponentiated Gradient versus Gradient Descent for Linear Predictors. Information and Computation, 1997
- Discriminant adaptive nearest neighbor classification. Published by Institute of Electrical and Electronics Engineers (IEEE), 1996