Striking Similarities in the Genomic Distribution of Tandemly Arrayed Genes in Arabidopsis and Rice
Open Access
- 1 September 2006
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 2 (9) , e115
- https://doi.org/10.1371/journal.pcbi.0020115
Abstract
In Arabidopsis, tandemly arrayed genes (TAGs) comprise >10% of the genes in the genome. These duplicated genes represent a rich template for genetic innovation, but little is known of the evolutionary forces governing their generation and maintenance. Here we compare the organization and evolution of TAGs between Arabidopsis and rice, two plant genomes that diverged ~150 million years ago. TAGs from the two genomes are similar in a number of respects, including the proportion of genes that are tandemly arrayed, the number of genes within an array, the number of tandem arrays, and the dearth of TAGs relative to single copy genes in centromeric regions. Analysis of recombination rates along rice chromosomes confirms a positive correlation between the occurrence of TAGs and recombination rate, as found in Arabidopsis. TAGs are also biased functionally relative to duplicated, nontandemly arrayed genes. In both genomes, TAGs are enriched for genes that encode membrane proteins and function in “abiotic and biotic stress” but underrepresented for genes involved in transcription and DNA or RNA binding functions. We speculate that these observations reflect an evolutionary trend in which successful tandem duplication involves genes either at the end of biochemical pathways or in flexible steps in a pathway, for which fluctuation in copy number is unlikely to affect downstream genes. Despite differences in the age distribution of tandem arrays, the striking similarities between rice and Arabidopsis indicate similar mechanisms of TAG generation and maintenance.
The nuclear genomes of higher plants vary tremendously in size and gene content. Much of this variation is attributable to gene duplication. To date, most studies of plant gene duplication have focused on whole genome duplication events, which duplicate all genes simultaneously. Another prominent process is single gene duplication, which often results in duplicated genes arranged in a tandem array. Here Rizzon, Ponger, and Gaut identify tandem arrays in rice and compare their genome organization between Arabidopsis and rice, two plant species that diverged ~150 million years ago. The two genomes contain a similar proportion of genes that are tandemly arrayed, with a similar number of genes within an array. Moreover, tandemly arrayed genes are most common in genomic regions of high recombination in both species. This organization appears to be a general feature of eukaryotic genomes, perhaps because duplication rates are higher in high recombination regions. Tandemly arrayed genes of rice and Arabidopsis also represent a biased gene set with regard to function. In contrast to genes duplicated through whole genome events, tandemly arrayed genes are enriched for genes that encode membrane proteins and genes that function in response to environmental stresses. Taken together, these observations suggest that tandemly arrayed genes represent a rich and relatively fluid source for plant adaptation.