TEnest: Automated Chronological Annotation and Visualization of Nested Plant Transposable Elements
Open Access
- 21 November 2007
- journal article
- Published by Oxford University Press (OUP) in Plant Physiology
- Vol. 146 (1) , 45-59
- https://doi.org/10.1104/pp.107.110353
Abstract
Organisms with a high density of transposable elements (TEs) exhibit nesting, with subsequent repeats found inside previously inserted elements. Nesting splits the sequence structure of TEs and makes annotation of repetitive areas challenging. We present TEnest, a repeat identification and display tool made specifically for highly repetitive genomes. TEnest identifies repetitive sequences and reconstructs separated sections to provide full-length repeats and, for long-terminal repeat (LTR) retrotransposons, calculates age since insertion based on LTR divergence. TEnest provides a chronological insertion display to give an accurate visual representation of TE integration history showing timeline, location, and families of each TE identified, thus creating a framework from which evolutionary comparisons can be made among various regions of the genome. A database of repeats has been developed for maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), and barley (Hordeum vulgare) to illustrate the potential of TEnest software. All currently finished maize bacterial artificial chromosomes totaling 29.3 Mb were analyzed with TEnest to provide a characterization of the repeat insertions. Sixty-seven percent of the maize genome was found to be made up of TEs; of these, 95% are LTR retrotransposons. The rate of solo LTR formation is shown to be dissimilar across retrotransposon families. Phylogenetic analysis of TE families reveals specific events of extreme TE proliferation, which may explain the high quantities of certain TE families found throughout the maize genome. The TEnest software package is available for use on PlantGDB under the tools section (http://www.plantgdb.org/prj/TE_nest/TE_nest.html); the source code is available from http://wiselab.org.
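The abstract states that insertion age for LTR retrotransposons is derived from LTR divergence but does not give the formula. The sketch below illustrates one common way such an estimate can be made. It is not the TEnest implementation. It assumes the two LTRs of one element are already aligned to equal length, uses the Kimura two-parameter distance as the divergence measure, and converts it to time with T = K / (2r). The function names and the substitution rate `ASSUMED_RATE` are illustrative placeholders and are not taken from the abstract.

```python
import math

# Assumed substitution rate (substitutions per site per year).
# This is a placeholder for illustration, not a value from the abstract.
ASSUMED_RATE = 1.3e-8

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(ltr_5prime: str, ltr_3prime: str) -> float:
    """Kimura two-parameter distance between two pre-aligned LTR sequences."""
    if len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(ltr_5prime.upper(), ltr_3prime.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age_years(ltr_5prime: str, ltr_3prime: str,
                        rate: float = ASSUMED_RATE) -> float:
    """Estimate years since insertion as T = K / (2 * rate).

    Both LTRs are identical at insertion and diverge independently
    afterwards, hence the factor of two.
    """
    return k2p_distance(ltr_5prime, ltr_3prime) / (2 * rate)
```

Calling `insertion_age_years(left_ltr, right_ltr)` on an aligned LTR pair returns an age in years under the assumed rate. The estimate scales inversely with the rate, so a lineage-specific rate should be supplied through the `rate` argument where one is known.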