An Equivalence Between Sparse Approximation and Support Vector Machines
- 1 August 1998
- journal article
- Published by MIT Press in Neural Computation
- Vol. 10 (6), 1455-1480
- https://doi.org/10.1162/089976698300017269
Abstract
This article shows a relationship between two different approximation techniques: the support vector machine (SVM), proposed by V. Vapnik (1995), and a sparse approximation scheme that resembles the basis pursuit denoising algorithm (Chen, 1995; Chen, Donoho, & Saunders, 1995). SVM is a technique that can be derived from the structural risk minimization principle (Vapnik, 1982) and can be used to estimate the parameters of several different approximation schemes, including radial basis functions, algebraic and trigonometric polynomials, B-splines, and some forms of multilayer perceptrons. Basis pursuit denoising is a sparse approximation technique in which a function is reconstructed using a small number of basis functions chosen from a large set (the dictionary). We show that if the data are noiseless, the modified version of basis pursuit denoising proposed in this article is equivalent to SVM in the following sense: applied to the same data set, the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix, we present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation, and regularization theory.
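To make the shared quadratic program concrete, here is a minimal sketch (an illustration under stated assumptions, not the paper's exact formulation, which measures the reconstruction error in the RKHS norm): with a positive definite kernel K, noiseless samples y, and the expansion f(x) = Σ_j c_j K(x, x_j), the coefficients solve min_c ½ cᵀKc − cᵀy + ε‖c‖₁. Up to sign, this is also the dual of ε-insensitive SV regression with no bias term and no box constraints, and its optimality conditions force c_i = 0 at every point whose residual already lies within ε, which is where the sparsity comes from. The Gaussian kernel, the toy target, and the cvxpy solver below are illustrative choices, not taken from the paper.

```python
# Minimal sketch (assumed formulation, not the paper's exact one): solve
#     min_c  0.5 * c^T K c  -  c^T y  +  eps * ||c||_1,
# the quadratic program shared by the modified basis pursuit denoising and
# (up to sign) the dual of eps-insensitive SV regression without bias or
# box constraints, then inspect the support-vector sparsity pattern.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, 30))
y = np.sinc(3.0 * x)                     # noiseless target samples

# Gaussian kernel matrix (an illustrative dictionary); a small jitter keeps
# the matrix numerically positive definite for the QP solver.
sigma = 0.3
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * sigma**2))
K += 1e-8 * np.eye(len(x))

eps = 0.05                               # width of the eps-insensitive tube
c = cp.Variable(len(x))
objective = 0.5 * cp.quad_form(c, K) - y @ c + eps * cp.norm1(c)
cp.Problem(cp.Minimize(objective)).solve()

coef = c.value
fit = K @ coef                           # f(x_i) = sum_j c_j K(x_i, x_j)
support = np.abs(coef) > 1e-6            # nonzero coefficients = support vectors
print(f"support vectors: {support.sum()} of {len(x)}")
# Optimality implies |f(x_i) - y_i| <= eps wherever c_i = 0.
if (~support).any():
    print("max residual off the support:", np.abs(fit - y)[~support].max())
```

The equivalence stated in the abstract is exactly this: for noiseless data, the modified basis pursuit denoising and the SVM reduce to the same quadratic program of this form, so both return the same sparse coefficient vector.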
References
- Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996
- Development of low entropy coding in a recurrent network. Network: Computation in Neural Systems, 1996
- Support-vector networks. Machine Learning, 1995
- Regularization Theory and Neural Networks Architectures. Neural Computation, 1995
- Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 1993
- Entropy-based algorithms for best basis selection. IEEE Transactions on Information Theory, 1992
- Fast Learning in Networks of Locally-Tuned Processing Units. Neural Computation, 1989
- Interpolation of scattered data: Distance matrices and conditionally positive definite functions. Constructive Approximation, 1986
- Positive definite functions and generalizations, an historical survey. Rocky Mountain Journal of Mathematics, 1976
- Theory of Reproducing Kernels. Transactions of the American Mathematical Society, 1950