Box–Cox transformations in the analysis of compositional data
- 1 May 1991
- journal article
- research article
- Published by Wiley in Journal of Chemometrics
- Vol. 5 (3), 227-239
- https://doi.org/10.1002/cem.1180050310
Abstract
The statistical analysis of compositional data is of fundamental importance to practitioners in general and to chemists in particular. The existing methodology is principally due to Aitchison, who effectively uses two transformations, a ratio followed by the logarithmic, to create a useful, coherent theory that in principle allows the plethora of normal‐based multivariate techniques to be used on the transformed data. This paper suggests that the well‐known class of Box–Cox transformations can be employed in place of the logarithmic to significantly improve the existing methodology. This is supported in part by showing that one of the most basic problems that Aitchison managed to overcome, namely the specification of an interpretable covariance structure for compositional data, can be resolved, or nearly resolved, once the ratio transformation has been applied. Hence the resolution is not directly dependent on the logarithmic transformation. It is then verified that access to the general Box–Cox family will allow a more accurate use of the normal‐based multivariate techniques, simply because better fits to normality can be achieved. Finally, maximum likelihood estimation and some associated asymptotics are employed to construct confidence intervals for ratios of the true, unknown compositional constituents. Heretofore this had not been done even in the context of the logarithmic transformation. Applications to real data are presented.
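The two-step transformation described in the abstract, a ratio followed by either the logarithm or a Box–Cox power, can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the paper's own choices: the last component is used as the divisor, a single parameter `lam` is applied to every ratio, and the `compositions` array is hypothetical data. The function names `ratio_transform` and `box_cox` are likewise illustrative.

```python
import numpy as np


def ratio_transform(x):
    """Divide the first D-1 parts of each composition by the last part.

    Assumption: the last column is the divisor; the abstract does not
    say which component the paper uses.
    """
    x = np.asarray(x, dtype=float)
    return x[:, :-1] / x[:, -1:]


def box_cox(y, lam):
    """Box-Cox transform of positive values; lam == 0 gives the natural log."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam


# Hypothetical three-part compositions; each row sums to 1.
compositions = np.array([
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

ratios = ratio_transform(compositions)   # shape (3, 2)
log_ratios = box_cox(ratios, 0.0)        # ratio followed by the logarithm
near_log = box_cox(ratios, 1e-6)         # Box-Cox with lam close to 0
half_power = box_cox(ratios, 0.5)        # another member of the family
```

As `lam` approaches 0 the Box–Cox values approach the log-ratios, so the logarithmic case sits inside the wider family as a limit. In practice `lam` would be estimated from the data, for example by maximum likelihood as the abstract indicates, rather than fixed in advance as it is here.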