The Role of Lineage-Specific Gene Family Expansion in the Evolution of Eukaryotes
Open Access
- 1 July 2002
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 12 (7) , 1048-1059
- https://doi.org/10.1101/gr.174302
Abstract
A computational procedure was developed for systematic detection of lineage-specific expansions (LSEs) of protein families in sequenced genomes and applied to obtain a census of LSEs in five eukaryotic species, the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe, the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster, and the green plant Arabidopsis thaliana. A significant fraction of the proteins encoded in each of these genomes, up to 80% in A. thaliana, belong to LSEs. Many paralogous gene families in each of the analyzed species are almost entirely comprised of LSEs, indicating that their diversification occurred after the divergence of the major lineages of the eukaryotic crown group. The LSEs show readily discernible patterns of protein functions. The functional categories most prone to LSE are structural proteins, enzymes involved in an organism's response to pathogens and environmental stress, and various components of signaling pathways responsible for specificity, including ubiquitin ligase E3 subunits and transcription factors. The functions of several previously uncharacterized, vastly expanded protein families were predicted through in-depth protein sequence analysis, for example, small-molecule kinases and methylases that are expanded independently in the fly and in the nematode. The functions of several other major LSEs remain mysterious; these protein families are attractive targets for experimental discovery of novel, lineage-specific functions in eukaryotes. LSEs seem to be one of the principal means of adaptation and one of the most important sources of organizational and regulatory diversity in crown-group eukaryotes. [Supplemental material is available online at ftp://ncbi.nlm.nih.gov/pub/aravind/expansions, and http://www.genome.org.]