Improvement of whole-genome annotation of cereals through comparative analyses
Open Access
- 6 February 2007
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (3) , 299-310
- https://doi.org/10.1101/gr.5881807
Abstract
Rice is an important model species for the Poaceae and other monocotyledonous plants. With the availability of a near-complete, finished, and annotated rice genome, we performed genome-level comparisons between rice and all plant species for which large genomic or transcriptomic data sets are available, to determine the utility of cross-species sequence for structural and functional annotation of the rice genome. Through comparative analyses with four plant genome sequence data sets and transcript assemblies from 185 plant species, we were able to confirm and improve the structural annotation of the rice genome. We found support for 38,109 (89.3%) of the 42,653 nontransposable element-related genes in the rice genome, in the form of a rice expressed sequence tag, full-length cDNA, or plant homolog from our comparative analyses. Although the majority of the putative homologs were obtained from Poaceae species, putative homologs were also identified in dicotyledonous angiosperms, gymnosperms, and other plants such as algae, moss, and fern. We identified a set of 7669 rice genes lacking a putative homolog; these may be lineage-specific genes that evolved after speciation and have a role in species diversity. Our comparative alignments also pointed to improvements in the current rice gene structural annotation: we identified 487 genes that were most likely missed in the current rice genome annotation and another 500 genes for structural annotation review. Finally, we demonstrated the utility of cross-species comparative alignments in the identification of noncoding sequences and in the confirmation of gene nesting in rice.