Relationships between the Definition of the Hyperplane Width to the Fidelity of Principal Component Loading Patterns

Open Access

1 June 1999

journal article
research article
Published by American Meteorological Society in Journal of Climate

Vol. 12 (6) , 1557-1576
https://doi.org/10.1175/1520-0442(1999)012<1557:rbtdot>2.0.co;2

Abstract

When applying eigenanalysis, one decision analysts must make is how large an eigenvector coefficient (e.g., a principal component [PC] loading) must be to be considered physically important. Such coefficients can be displayed on maps, as time series, or in tables to gain a fuller understanding of a large array of multivariate data. Previously, the decision on what loading value designates a useful signal (hereafter called the loading "cutoff") for each eigenvector has been purely subjective. Selecting such a cutoff matters because loadings in the range from zero to the cutoff are ignored in the interpretation and naming of PCs; only loadings whose absolute values exceed the cutoff are physically analyzed. This research sets out to make the choice of cutoff objective by matching known correlation/covariance structures against their corresponding eigenpatterns as this cutoff point (known as the hyperplane width) is varied. A Monte Carlo framework is used to resample at five sample sizes. Fourteen different hyperplane cutoff widths are tested, with bootstrap resampling 50 times to obtain stable results. The key finding is that the location of an optimal hyperplane cutoff width (one that maximizes the information content match between the eigenvector and the parent dispersion matrix from which it was derived) is a well-behaved unimodal function. On an individual eigenvector, this enables the unique determination of a hyperplane cutoff value that separates the loadings that best reflect the relationships from those that do not. The effect of sample size on matching accuracy is dramatic: the values for all solutions (i.e., unrotated and rotated) rose steadily from 25 through 250 observations and only weakly thereafter.
The specific matching coefficients are useful for assessing the penalties incurred when one analyzes eigenvector coefficients whose absolute value is lower than the cutoff (termed coefficients in the hyperplane) or, alternatively, chooses not to analyze coefficients outside the hyperplane that contain useful physical signal. This study therefore enables analysts to make the best use of the information available in their PCs to shed light on complicated data structures.
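
The procedure the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: the block-diagonal parent correlation matrix, the use of a congruence coefficient as the matching measure, the choice of the parent column belonging to the variable with the largest absolute loading as the comparison target, and the helper names `block_corr`, `pc_loadings`, `apply_hyperplane`, `congruence`, and `match_curve`. The sketch draws one sample from a known parent structure, bootstrap resamples it, zeroes the loadings that fall inside the hyperplane, and averages the match at each cutoff.

```python
import numpy as np


def block_corr(sizes, rhos):
    """Build a block-diagonal parent correlation matrix (hypothetical test structure)."""
    p = sum(sizes)
    corr = np.eye(p)
    start = 0
    for size, rho in zip(sizes, rhos):
        stop = start + size
        corr[start:stop, start:stop] = rho
        start = stop
    np.fill_diagonal(corr, 1.0)
    return corr


def pc_loadings(data):
    """Unrotated PC loadings: eigenvectors of the sample correlation matrix
    scaled by the square root of their eigenvalues, largest eigenvalue first."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(eigvals)


def apply_hyperplane(loadings, cutoff):
    """Zero every loading whose absolute value does not exceed the cutoff."""
    return np.where(np.abs(loadings) > cutoff, loadings, 0.0)


def congruence(a, b):
    """Congruence coefficient (assumed matching measure); 0.0 if either vector is all zero."""
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return 0.0 if denom == 0.0 else float(np.sum(a * b) / denom)


def match_curve(parent_corr, n_obs, cutoffs, n_boot=50, seed=0):
    """Mean match between the thresholded leading PC and the parent structure
    at each cutoff, averaged over bootstrap resamples of one Monte Carlo sample."""
    rng = np.random.default_rng(seed)
    p = parent_corr.shape[0]
    sample = rng.multivariate_normal(np.zeros(p), parent_corr, size=n_obs)
    scores = np.zeros(len(cutoffs))
    for _ in range(n_boot):
        rows = rng.integers(0, n_obs, size=n_obs)  # resample rows with replacement
        lead = pc_loadings(sample[rows])[:, 0]
        key = int(np.argmax(np.abs(lead)))         # variable with the largest loading
        target = parent_corr[:, key]               # assumed comparison target
        for j, cutoff in enumerate(cutoffs):
            # Absolute value removes the arbitrary sign of the eigenvector.
            scores[j] += abs(congruence(apply_hyperplane(lead, cutoff), target))
    return scores / n_boot


if __name__ == "__main__":
    parent = block_corr([4, 3], [0.7, 0.5])
    cutoffs = np.linspace(0.0, 1.0, 11)
    for n_obs in (25, 100, 250):
        print(n_obs, np.round(match_curve(parent, n_obs, cutoffs), 3))
```

Under these assumptions, a cutoff that is too small keeps noise loadings that lower the match, while one that is too large discards real signal. That trade-off is what the abstract describes. The cutoff grid and sample sizes in the usage lines are placeholders, not the fourteen widths or five sample sizes used in the study.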
