Statistical analysis of the DNA sequence of human chromosome 22
- 26 September 2001
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review E
- Vol. 64 (4) , 041917
- https://doi.org/10.1103/physreve.64.041917
Abstract
We study statistical patterns in the DNA sequence of human chromosome 22, the first completely sequenced human chromosome. We find that (i) the sequence of this chromosome exhibits long-range power-law correlations over more than four orders of magnitude, (ii) the entropies of the frequency distribution of oligonucleotides of length n (n-mers) grow sublinearly with increasing n, indicating the presence of higher-order correlations for all of the studied lengths, and (iii) the generalized entropies of n-mers decrease monotonically with increasing q, and their decay with q becomes steeper with increasing n, indicating that the frequency distribution of oligonucleotides becomes increasingly nonuniform as the length n increases. We investigate to what degree known biological features may explain the observed statistical patterns. We find that (iv) the presence of interspersed repeats may cause the sublinear increase of the n-mer entropies with n, and that (v) the presence of monomeric tandem repeats as well as the suppression of CG dinucleotides may cause the observed decay of the generalized entropies with q.
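As a rough illustration of the quantities named in the abstract, the following minimal sketch estimates n-mer frequencies from a sequence and computes an entropy of order q. The abstract does not define its generalized entropy, so the sketch assumes the Rényi form, which reduces to the Shannon entropy at q = 1. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def nmer_probabilities(seq, n):
    """Relative frequencies of overlapping n-mers in seq."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def generalized_entropy(probs, q):
    """Assumed Renyi entropy of order q, in bits; q = 1 is the Shannon entropy."""
    if q == 1:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** q for p in probs)) / (1 - q)


# Hypothetical toy sequence with a monomeric T run; not chromosome 22 data.
toy = "ACGTACGTTTTTTTTTACGA"
for n in (1, 2, 3):
    probs = nmer_probabilities(toy, n)
    # One entropy value per order q; a uniform distribution would give equal values.
    print(n, [round(generalized_entropy(probs, q), 3) for q in (0, 1, 2)])
```

For a uniform n-mer distribution the entropy is the same for every q, so any decrease with q in such an estimate reflects nonuniform n-mer frequencies, for example from the repeated T in the toy sequence. Estimates from short sequences are biased for larger n, which this sketch does not correct.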