Principal Component Analysis and Large-Scale Correlations in Non-Coding Sequences of Human DNA
- 1 January 1996
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 3 (4) , 573-576
- https://doi.org/10.1089/cmb.1996.3.573
Abstract
We have calculated a full set of second-order correlation functions of nucleotides in noncoding DNA. They are found to be invariant under the exchange of A and T and, independently, under the exchange of C and G. Treating the correlation functions as a 4 × 4 matrix in a symmetrical basis, we have found the principal components, that is, objects with zero cross-correlations. The three principal components represent the base compositions (A + T − C − G), (A − T), and (C − G). The long-range behavior of these principal components follows power laws with different critical exponents.
Key words: long-range correlations, DNA, principal component analysis
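The calculation summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the base ordering `ATCG`, and the definition C_ab(r) = P_ab(r) − p_a p_b for the second-order correlation function are assumptions made here for illustration. The sketch builds the 4 × 4 correlation matrix at distance r and projects it onto the three combinations named in the abstract.

```python
import numpy as np

BASES = "ATCG"  # assumed ordering: A, T, C, G

# The three combinations named in the abstract, as weights on (A, T, C, G).
COMBOS = {
    "A+T-C-G": [1, 1, -1, -1],
    "A-T": [1, -1, 0, 0],
    "C-G": [0, 0, 1, -1],
}


def correlation_matrix(seq, r):
    """4x4 matrix C[a, b] = P_ab(r) - p_a * p_b for a DNA string, r >= 1.

    P_ab(r) is the observed frequency of base a at position i together
    with base b at position i + r (assumed definition, see lead-in).
    """
    idx = np.array([BASES.index(ch) for ch in seq])
    onehot = np.eye(4)[idx]                  # shape (N, 4), one row per base
    left, right = onehot[:-r], onehot[r:]    # positions i and i + r
    joint = left.T @ right / len(left)       # joint frequencies P_ab(r)
    return joint - np.outer(left.mean(axis=0), right.mean(axis=0))


def component_correlations(seq, r):
    """3x3 matrix of correlations between the three combinations.

    Diagonal entries are the autocorrelations of each combination;
    off-diagonal entries are their cross-correlations.
    """
    V = np.array(list(COMBOS.values()), dtype=float)
    return V @ correlation_matrix(seq, r) @ V.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = "".join(rng.choice(list(BASES), size=10_000))
    print(component_correlations(seq, r=10))
```

If the correlation matrix is invariant under exchanging A with T and, independently, C with G, the off-diagonal entries of the 3 × 3 result vanish, which is the sense in which these combinations have zero cross-correlations. On real sequences one would repeat the calculation over a range of r and examine how each diagonal entry decays.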