An Equivalence Between Sparse Approximation and Support Vector Machines
- 1 August 1998
- journal article
- Published by MIT Press in Neural Computation
- Vol. 10 (6), 1455-1480
- https://doi.org/10.1162/089976698300017269
Abstract
This article shows a relationship between two different approximation techniques: the support vector machine (SVM), proposed by V. Vapnik (1995), and a sparse approximation scheme that resembles the basis pursuit denoising algorithm (Chen, 1995; Chen, Donoho, & Saunders, 1995). SVM is a technique that can be derived from the structural risk minimization principle (Vapnik, 1982) and can be used to estimate the parameters of several different approximation schemes, including radial basis functions, algebraic and trigonometric polynomials, B-splines, and some forms of multilayer perceptrons. Basis pursuit denoising is a sparse approximation technique in which a function is reconstructed using a small number of basis functions chosen from a large set (the dictionary). We show that if the data are noiseless, the modified version of basis pursuit denoising proposed in this article is equivalent to SVM in the following sense: applied to the same data set, the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix, we present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation, and regularization theory.
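To make the shared quadratic program concrete, here is a minimal sketch (an illustration under stated assumptions, not the paper's exact formulation, which measures the reconstruction error in the RKHS norm): with a positive definite kernel K, noiseless samples y, and the expansion f(x) = Σ_j c_j K(x, x_j), the coefficients solve min_c ½ cᵀKc − cᵀy + ε‖c‖₁. Up to sign, this is also the dual of ε-insensitive SV regression with no bias term and no box constraints, and its optimality conditions force c_i = 0 at every point whose residual already lies within ε, which is where the sparsity comes from. The Gaussian kernel, the toy target, and the cvxpy solver below are illustrative choices, not taken from the paper.

```python
# Minimal sketch (assumed formulation, not the paper's exact one): solve
#     min_c  0.5 * c^T K c  -  c^T y  +  eps * ||c||_1,
# the quadratic program shared by the modified basis pursuit denoising and
# (up to sign) the dual of eps-insensitive SV regression without bias or
# box constraints, then inspect the support-vector sparsity pattern.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, 30))
y = np.sinc(3.0 * x)                     # noiseless target samples

# Gaussian kernel matrix (an illustrative dictionary); a small jitter keeps
# the matrix numerically positive definite for the QP solver.
sigma = 0.3
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * sigma**2))
K += 1e-8 * np.eye(len(x))

eps = 0.05                               # width of the eps-insensitive tube
c = cp.Variable(len(x))
objective = 0.5 * cp.quad_form(c, K) - y @ c + eps * cp.norm1(c)
cp.Problem(cp.Minimize(objective)).solve()

coef = c.value
fit = K @ coef                           # f(x_i) = sum_j c_j K(x_i, x_j)
support = np.abs(coef) > 1e-6            # nonzero coefficients = support vectors
print(f"support vectors: {support.sum()} of {len(x)}")
# Optimality implies |f(x_i) - y_i| <= eps wherever c_i = 0.
if (~support).any():
    print("max residual off the support:", np.abs(fit - y)[~support].max())
```

The equivalence stated in the abstract is exactly this: for noiseless data, the modified basis pursuit denoising and the SVM reduce to the same quadratic program of this form, so both return the same sparse coefficient vector.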
References
- Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996
- Development of low entropy coding in a recurrent network. Network: Computation in Neural Systems, 1996
- Support-vector networks. Machine Learning, 1995
- Regularization Theory and Neural Networks Architectures. Neural Computation, 1995
- Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 1993
- Entropy-based algorithms for best basis selection. IEEE Transactions on Information Theory, 1992
- Fast Learning in Networks of Locally-Tuned Processing Units. Neural Computation, 1989
- Interpolation of scattered data: Distance matrices and conditionally positive definite functions. Constructive Approximation, 1986
- Positive definite functions and generalizations, an historical survey. Rocky Mountain Journal of Mathematics, 1976
- Theory of Reproducing Kernels. Transactions of the American Mathematical Society, 1950