Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes
Open Access
- 15 September 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (18) , 5338-5348
- https://doi.org/10.1093/nar/gkg745
Abstract
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To be sure to differentiate the sequence changes occurring in the functional genes during evolution from those occurring in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough that it has many indels, thus allowing for the first time statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power-law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.
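The two headline quantities in the abstract can be illustrated with a short Python sketch. It is not the authors' pipeline: the function name `classify_substitution` and the constants are hypothetical, and the only data used are the indel counts quoted above (3740 deletions, 1291 insertions). The sketch applies the standard definition that a transition swaps purine for purine (A, G) or pyrimidine for pyrimidine (C, T), and any other substitution is a transversion.

```python
from itertools import permutations

# Standard base classes; the names below are illustrative, not from the paper.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("expected bases from A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The 12 possible directed nucleotide pairs mentioned in the abstract.
all_pairs = list(permutations(sorted(BASES), 2))
n_transitions = sum(classify_substitution(r, a) == "transition" for r, a in all_pairs)
n_transversions = len(all_pairs) - n_transitions

# Indel counts quoted in the abstract.
DELETIONS, INSERTIONS = 3740, 1291
deletion_to_insertion_ratio = DELETIONS / INSERTIONS
```

Only 4 of the 12 directed pairs are transitions, so if all 12 pairs occurred at equal rates, transversions would outnumber transitions two to one. The reported twofold excess of transitions therefore reflects a strong bias per substitution type. The ratio computed from the quoted counts is about 2.9, consistent with "about three times".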