Evolution of protein structural classes and protein sequence families
- 19 September 2006
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 103 (38) , 14056-14061
- https://doi.org/10.1073/pnas.0606239103
Abstract
In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four “allowed” regions of the protein structure space, “structural selection,” an important component of natural selection in gene evolution. Our studies on tracing the “common structural ancestor” for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were “born” at different evolutionary times, not traceable to one or two ancestral proteins: “the multiple birth model” for the evolution of protein sequence families.