Data clustering and noise undressing of correlation matrices
- 15 May 2001
- journal article
- Published by American Physical Society (APS) in Physical Review E
- Vol. 63 (6) , 061101
- https://doi.org/10.1103/physreve.63.061101
Abstract
We discuss a new approach to data clustering. We find that maximum likelihood leads naturally to a Hamiltonian of Potts variables which depends on the correlation matrix and whose low-temperature behavior describes the correlation structure of the data. For random, uncorrelated data sets, no correlation structure emerges. On the other hand, for data sets with a built-in cluster structure, the method is able to detect and efficiently recover that structure. Finally, we apply the method to financial time series, where the low-temperature behavior reveals a nontrivial clustering.
Comment: 8 pages, 5 figures, completely rewritten and enlarged version of cond-mat/0003241. Submitted to Phys. Rev.
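As a rough illustration of the idea in the abstract, the sketch below builds synthetic series with a built-in cluster structure, computes their correlation matrix, and scores a cluster assignment with a Potts-type energy. The energy used here, -sum over pairs i<j of C_ij * delta(s_i, s_j), is a generic stand-in chosen for illustration and is an assumption; it is not the Hamiltonian derived in the paper. The function name `potts_energy`, the cluster sizes, and the one-factor data model are likewise hypothetical.

```python
import numpy as np

def potts_energy(corr, labels):
    """Illustrative Potts-type energy: -sum_{i<j} C_ij * delta(s_i, s_j).

    Assumption: a generic stand-in, NOT the Hamiltonian from the paper.
    """
    same = labels[:, None] == labels[None, :]               # delta(s_i, s_j)
    upper = np.triu(np.ones_like(corr, dtype=bool), k=1)    # pairs with i < j
    return -corr[same & upper].sum()

rng = np.random.default_rng(0)
n_clusters, per_cluster, n_obs = 3, 10, 500   # hypothetical sizes
true_labels = np.repeat(np.arange(n_clusters), per_cluster)

# Each series = its cluster's common factor + independent noise,
# so series in the same cluster are correlated and others are not.
factors = rng.standard_normal((n_clusters, n_obs))
noise = rng.standard_normal((n_clusters * per_cluster, n_obs))
series = factors[true_labels] + noise

corr = np.corrcoef(series)                    # rows are variables
shuffled_labels = rng.permutation(true_labels)

energy_true = potts_energy(corr, true_labels)
energy_shuffled = potts_energy(corr, shuffled_labels)
print(energy_true, energy_shuffled)
```

Under these assumptions, the assignment that matches the built-in clusters should have a lower energy than a shuffled assignment with the same cluster sizes, which is the sense in which a low-energy (low-temperature) configuration reflects the correlation structure. Finding such configurations in practice would need a search procedure such as Monte Carlo sampling, which this sketch omits.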