Separating Style and Content with Bilinear Models
- 1 June 2000
- journal article
- Published by MIT Press in Neural Computation
- Vol. 12 (6), 1247-1283
- https://doi.org/10.1162/089976600300015349
Abstract
Perceptual systems routinely separate “content” from “style,” classifying familiar words spoken in an unfamiliar accent, identifying a font or handwriting style across letters, or recognizing a familiar face or object seen under unfamiliar viewing conditions. Yet a general and tractable computational model of this ability to untangle the underlying factors of perceptual observations remains elusive (Hofstadter, 1985). Existing factor models (Mardia, Kent, & Bibby, 1979; Hinton & Zemel, 1994; Ghahramani, 1995; Bell & Sejnowski, 1995; Hinton, Dayan, Frey, & Neal, 1995; Dayan, Hinton, Neal, & Zemel, 1995; Hinton & Ghahramani, 1997) are either insufficiently rich to capture the complex interactions of perceptually meaningful factors such as phoneme and speaker accent or letter and font, or do not allow efficient learning algorithms. We present a general framework for learning to solve two-factor tasks using bilinear models, which provide sufficiently expressive representations of factor interactions but can nonetheless be fit to data using efficient algorithms based on the singular value decomposition and expectation-maximization. We report promising results on three different tasks in three different perceptual domains: spoken vowel classification with a benchmark multi-speaker database, extrapolation of fonts to unseen letters, and translation of faces to novel illuminants.
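The abstract says that bilinear models can be fit with algorithms based on the singular value decomposition but gives no procedure. The sketch below is one illustrative reading, not the authors' exact method: it assumes an asymmetric bilinear form in which each observation is a style-specific matrix applied to a content vector, and it assumes complete data (exactly one observation for every style-content pair). The function name `fit_asymmetric_bilinear`, the stacking layout, and all dimensions are hypothetical choices made for this example.

```python
import numpy as np


def fit_asymmetric_bilinear(Y, n_styles, obs_dim, model_dim):
    """Fit y[s, c] ~= A[s] @ b[c] by truncated SVD (illustrative sketch).

    Y is assumed to be a stacked observation matrix of shape
    (n_styles * obs_dim, n_contents): column c holds the observations of
    content c under every style, concatenated style by style.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Keep the leading model_dim singular directions.
    A_stacked = U[:, :model_dim] * s[:model_dim]   # style factors, stacked
    B = Vt[:model_dim, :]                          # content vectors as columns
    A = A_stacked.reshape(n_styles, obs_dim, model_dim)
    return A, B


# Synthetic check: data generated from a true bilinear model (assumed sizes).
rng = np.random.default_rng(0)
S, C, K, J = 3, 5, 4, 2          # styles, contents, observation dim, model dim
A_true = rng.normal(size=(S, K, J))
B_true = rng.normal(size=(J, C))
Y = A_true.reshape(S * K, J) @ B_true

A_hat, B_hat = fit_asymmetric_bilinear(Y, S, K, J)
Y_hat = A_hat.reshape(S * K, J) @ B_hat
max_error = float(np.max(np.abs(Y - Y_hat)))
print("max reconstruction error:", max_error)
```

Because the synthetic data have rank at most `J`, the rank-`J` truncation reproduces them up to floating-point error. The recovered factors match the true ones only up to an invertible linear transformation, since any such transformation can be absorbed into the style matrices and undone in the content vectors.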
This publication has 25 references indexed in Scilit:
- Generative models for discovering sparse distributed representations. Philosophical Transactions of the Royal Society B: Biological Sciences, 1997
- Statistical Approach to Shape from Shading: Reconstruction of Three-Dimensional Face Surfaces from Single Two-Dimensional Images. Neural Computation, 1996
- Image Representations for Visual Learning. Science, 1996
- Discriminant adaptive nearest neighbor classification. Institute of Electrical and Electronics Engineers (IEEE), 1996
- An Information-Maximization Approach to Blind Separation and Blind Deconvolution. Neural Computation, 1995
- The Helmholtz Machine. Neural Computation, 1995
- The "Wake-Sleep" Algorithm for Unsupervised Neural Networks. Science, 1995
- Connectionist generalization for production: An example from GridFont. Neural Networks, 1992
- Color constancy: surface color from changing illumination. Journal of the Optical Society of America A, 1992
- Encoding of Spatial Location by Posterior Parietal Neurons. Science, 1985