Gene Family Evolution across 12 Drosophila Genomes
Open Access
- 9 November 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 3 (11), e197-2146
- https://doi.org/10.1371/journal.pgen.0030197
Abstract
Comparison of whole genomes has revealed large and frequent changes in the size of gene families. These changes occur because of high rates of both gene gain (via duplication) and loss (via deletion or pseudogenization), as well as the evolution of entirely new genes. Here we use the genomes of 12 fully sequenced Drosophila species to study the gain and loss of genes at unprecedented resolution. We find large numbers of both gains and losses, with over 40% of all gene families differing in size among the Drosophila. Approximately 17 genes are estimated to be duplicated and fixed in a genome every million years, a rate on par with that previously found in both yeast and mammals. We find many instances of extreme expansions or contractions in the size of gene families, including the expansion of several sex- and spermatogenesis-related families in D. melanogaster that also evolve under positive selection at the nucleotide level. Newly evolved gene families in our dataset are associated with a class of testes-expressed genes known to have evolved de novo in a number of cases. Gene family comparisons also allow us to identify a number of annotated D. melanogaster genes that are unlikely to encode functional proteins, as well as to identify dozens of previously unannotated D. melanogaster genes with conserved homologs in the other Drosophila. Taken together, our results demonstrate that the apparent stasis in total gene number among species has masked rapid turnover in individual gene gain and loss. It is likely that this genomic revolving door has played a large role in shaping the morphological, physiological, and metabolic differences among species.
Author Summary
Though comparative genome sequencing has revealed vast similarities in the total number of genes contained within closely related species, this similarity hides enormous complexities in the identity and number of constituent proteins. Species can differ in their complement of genes through both gene duplication and loss. Here we investigated the gain and loss of genes from the genomes of 12 fully sequenced Drosophila (fruit flies). We find high rates of gain and loss in all species and estimate that approximately one new gene is gained or lost every 60,000 years. We also find several hundred cases of extremely rapid gene turnover, with dozens of genes gained or lost in only a few million years. The highest turnover in gene number occurs in genes involved in sex and reproduction. Taken together, our results demonstrate that the apparent stasis in total gene number among species has masked rapid turnover in individual gene gain and loss. It is likely that this evolutionary revolving door has played a large role in shaping the morphological, physiological, and metabolic differences among species.
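The two rates quoted above (about 17 genes per million years, and about one gene every 60,000 years) can be compared with a simple reciprocal conversion. The Python sketch below is only an order-of-magnitude check, not the authors' estimation method. It assumes both figures describe events per genome, and it glosses over the fact that the first figure counts duplicated-and-fixed genes while the second counts genes gained or lost. The function names are hypothetical.

```python
# Order-of-magnitude check on the two rates quoted in the text.
# Assumption: both figures are per-genome rates; this is a plain
# reciprocal conversion, not the paper's likelihood-based estimate.

YEARS_PER_MYR = 1_000_000


def years_per_event(events_per_myr: float) -> float:
    """Average waiting time in years between events, given events per million years."""
    return YEARS_PER_MYR / events_per_myr


def events_per_myr(waiting_time_years: float) -> float:
    """Events per million years, given the average waiting time in years."""
    return YEARS_PER_MYR / waiting_time_years


if __name__ == "__main__":
    # Rate quoted in the abstract: ~17 genes per million years.
    print(f"17 per Myr -> one event every {years_per_event(17):,.0f} years")
    # Rate quoted in the summary: ~one gene every 60,000 years.
    print(f"one per 60,000 years -> {events_per_myr(60_000):.1f} per Myr")
```

Under these assumptions, 17 events per million years corresponds to one event roughly every 59,000 years, which agrees with the rounded figure of 60,000 years.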