The Genomes of Oryza sativa: A History of Duplications
Open Access
- 1 February 2005
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Biology
- Vol. 3 (2) , e38
- https://doi.org/10.1371/journal.pbio.0030038
Abstract
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
This publication has 92 references indexed in Scilit:
- Vertebrate gene predictions and the problem of large genes. Nature Reviews Genetics, 2003
- Initial sequencing and comparative analysis of the mouse genome. Nature, 2002
- The Impact of Polyploidy on Grass Genome Evolution. Plant Physiology, 2002
- Splitting pairs: the diverging fates of duplicated genes. Nature Reviews Genetics, 2002
- Evolutionary dynamics of grass genomes. New Phytologist, 2002
- BLAT—The BLAST-Like Alignment Tool. Genome Research, 2002
- A Whole-Genome Assembly of Drosophila. Science, 2000
- Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology, 1997
- Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proceedings of the National Academy of Sciences, 1996
- Identification of the duplicated segments in rice chromosomes 1 and 5 by linkage analysis of cDNA markers of known functions. Theoretical and Applied Genetics, 1994