Almost all human genes resulted from ancient duplication
- 12 December 2006
- journal article
- Published in Proceedings of the National Academy of Sciences
- Vol. 103 (50), 19027-19032
- https://doi.org/10.1073/pnas.0608796103
Abstract
Results of protein sequence comparison at open criterion show a very large number of relationships that have, up to now, gone unreported. The relationships suggest many ancient events of gene duplication. It is well known that gene duplication has been a major process in the evolution of genomes. A collection of human genes that have known functions has been examined for a history of gene duplications, detected by amino acid sequence similarity using BLASTp with an expectation of two or less (open criterion). Because the collection of genes in build 35 includes sets of transcript variants, all genes of known function were collected, and only the longest transcript variant was included, yielding a 13,298-member library called KGMV (for known genes maximum variant). When all lengths of matches are accepted, >97% of human genes show significant matches to each other. Many form matches with a large number of other different proteins, showing that most genes are made up from parts of many others as a result of ancient events of duplication. To support the use of the open criterion, all of the members of the KGMV library were twice replaced with random protein sequences of the same length and average composition, and all were compared with each other with BLASTp at expectation two or less. The set of matches averaged 0.35% of that observed for the KGMV set of proteins.
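The randomized control described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes that "average composition" means the residue frequencies pooled over the whole library, the function names `average_composition` and `randomized_library` are hypothetical, and the toy sequences stand in for the KGMV proteins. It only builds the random library; the all-against-all BLASTp comparison is not shown.

```python
import random
from collections import Counter


def average_composition(sequences):
    """Residue frequencies pooled over every sequence in the library."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def randomized_library(sequences, seed=0):
    """Replace each sequence with a random one of the same length,
    drawing residues according to the library-wide composition
    (an assumed reading of the control in the abstract)."""
    rng = random.Random(seed)
    composition = average_composition(sequences)
    residues = sorted(composition)
    weights = [composition[r] for r in residues]
    return [
        "".join(rng.choices(residues, weights=weights, k=len(seq)))
        for seq in sequences
    ]


# Toy stand-in for the protein library (hypothetical sequences).
toy_library = ["MKTAYIAKQR", "MSDNELLKKA", "MAAAKKR"]
control = randomized_library(toy_library, seed=1)
```

Running the same search on two such control libraries, as the abstract describes, gives a baseline for how many matches arise by chance at the same expectation threshold.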