On the distribution of the largest eigenvalue in principal components analysis
Open Access
- 1 April 2001
- journal article
- research article
- Published by Institute of Mathematical Statistics in The Annals of Statistics
- Vol. 29 (2) , 295-327
- https://doi.org/10.1214/aos/1009210544
Abstract
Let $x_{(1)}$ denote the square of the largest singular value of an n × p matrix X, all of whose entries are independent standard Gaussian variates. Equivalently, $x_{(1)}$ is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a p-variate Wishart distribution on n degrees of freedom with identity covariance. Consider the limit of large p and n with $n/p = \gamma \ge 1$. When centered by $\mu_p = (\sqrt{n-1} + \sqrt{p})^2$ and scaled by $\sigma_p = (\sqrt{n-1} + \sqrt{p})(1/\sqrt{n-1} + 1/\sqrt{p})^{1/3}$, the distribution of $x_{(1)}$ approaches the Tracy-Widom law of order 1, which is defined in terms of the Painlevé II differential equation and can be numerically evaluated and tabulated in software. Simulations show the approximation to be informative for n and p as small as 5. The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large p multivariate distribution theory may be easier to apply in practice than their fixed p counterparts.
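The centering and scaling described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's code: the function names `tw_centering_scaling` and `standardized_largest_eigenvalues`, the number of replications, and the seed are all assumptions made for illustration. It only standardizes simulated largest eigenvalues; it does not evaluate the Tracy-Widom distribution itself.

```python
import numpy as np


def tw_centering_scaling(n, p):
    """Return (mu_p, sigma_p) as given in the abstract (hypothetical helper name)."""
    a = np.sqrt(n - 1) + np.sqrt(p)
    mu = a ** 2
    sigma = a * (1.0 / np.sqrt(n - 1) + 1.0 / np.sqrt(p)) ** (1.0 / 3.0)
    return mu, sigma


def standardized_largest_eigenvalues(n, p, reps=2000, seed=0):
    """Simulate (x_(1) - mu_p) / sigma_p for n x p standard Gaussian matrices.

    reps and seed are illustrative choices, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = tw_centering_scaling(n, p)
    # A stack of `reps` independent n x p matrices with N(0, 1) entries.
    X = rng.standard_normal((reps, n, p))
    # Singular values come back in descending order; square the largest one
    # to get the largest eigenvalue of X'X.
    s = np.linalg.svd(X, compute_uv=False)
    x1 = s[:, 0] ** 2
    return (x1 - mu) / sigma


if __name__ == "__main__":
    z = standardized_largest_eigenvalues(n=20, p=5)
    print("sample mean:", z.mean(), "sample std:", z.std())
```

If the approximation holds, the sample mean and spread of the standardized values should be close to those of the Tracy-Widom law of order 1, whose mean is roughly -1.21. Comparing a histogram of `z` against tabulated Tracy-Widom quantiles would be the natural next step.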