Stochastic proximity embedding
- 9 June 2003
- journal article
- research article
- Published by Wiley in Journal of Computational Chemistry
- Vol. 24 (10) , 1215-1221
- https://doi.org/10.1002/jcc.10234
Abstract
We introduce stochastic proximity embedding (SPE), a novel self‐organizing algorithm for producing meaningful underlying dimensions from proximity data. SPE attempts to generate low‐dimensional Euclidean embeddings that best preserve the similarities between a set of related observations. The method starts with an initial configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates so that their distances on the map match more closely their respective proximities. The magnitude of these adjustments is controlled by a learning rate parameter, which decreases during the course of the simulation to avoid oscillatory behavior. Unlike classical multidimensional scaling (MDS) and nonlinear mapping (NLM), SPE scales linearly with respect to sample size, and can be applied to very large data sets that are intractable by conventional embedding procedures. The method is programmatically simple, robust, and convergent, and can be applied to a wide range of scientific problems involving exploratory data analysis and visualization. © 2003 Wiley Periodicals, Inc. J Comput Chem 24: 1215–1221, 2003
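The refinement loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration based only on the description above, not the authors' reference implementation. The function name `spe_embed`, the linear learning-rate schedule, the default parameter values, and the symmetric pairwise correction (each point moves by half of the scaled distance error along the line joining the pair) are all assumptions made for this example.

```python
import math
import random


def spe_embed(proximity, n_points, dim=2, n_cycles=100, steps_per_cycle=None,
              lr_start=1.0, lr_end=0.01, eps=1e-9, seed=0):
    """Illustrative SPE-style loop (hypothetical helper, not from the paper).

    proximity(i, j) must return the target distance between objects i and j.
    Returns a list of n_points coordinate lists, each of length dim.
    """
    rng = random.Random(seed)
    if steps_per_cycle is None:
        # Assumption: the number of pair updates grows linearly with sample size.
        steps_per_cycle = 10 * n_points

    # Initial configuration: random coordinates in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]

    lr = lr_start
    lr_step = (lr_start - lr_end) / max(n_cycles - 1, 1)
    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            # Select a pair of distinct objects at random.
            i, j = rng.sample(range(n_points), 2)
            d = math.dist(coords[i], coords[j])   # current distance on the map
            r = proximity(i, j)                   # target proximity
            # Move both points along the line joining them so d approaches r;
            # eps guards against division by zero for coincident points.
            scale = lr * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (coords[i][k] - coords[j][k])
                coords[i][k] += delta
                coords[j][k] -= delta
        # Decrease the learning rate to damp oscillations (assumed linear decay).
        lr -= lr_step
    return coords
```

As a usage example, calling `spe_embed(lambda i, j: 1.0, n_points=3)` asks for three points that are all one unit apart, so the result should approximate an equilateral triangle in the plane. Because each step touches a single pair, the cost per step does not depend on the number of objects, which is the property behind the linear scaling the abstract mentions.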