Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes
Open Access
- 14 July 2011
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Pathogens
- Vol. 7 (7) , e1002146
- https://doi.org/10.1371/journal.ppat.1002146
Abstract
Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not in those of plants. Here we report similarly endogenized NRVSs derived from plus-sense, negative-sense, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The most widespread NRVSs were related to the coat protein (CP) genes of members of the family Partitiviridae, which have bisegmented dsRNA genomes and include plant- and fungus-infecting viruses. The CP of a novel fungal virus (Rosellinia necatrix partitivirus 2, RnPV2) had the greatest sequence similarity to Arabidopsis thaliana ILR2, which is thought to regulate the activities of the phytohormone auxin (indole-3-acetic acid, IAA). Furthermore, partitivirus CP-like sequences much more closely related to plant partitiviruses than to RnPV2 were identified in a wide range of plant species. In addition, the nucleocapsid protein genes of cytorhabdoviruses and varicosaviruses were found in species from more than nine plant families, including Brassicaceae and Solanaceae. A replicase-like sequence of a betaflexivirus was identified in the cucumber genome. The pattern of occurrence of NRVSs and the phylogenetic analyses of NRVSs and related viruses indicate that multiple independent integrations into many plant lineages may have occurred. For example, one of the NRVSs was retained in Ar. thaliana but not in Ar. lyrata or other related Camelina species, whereas another NRVS displayed the reverse pattern. Our study shows that single- and double-stranded RNA viral sequences are widespread in plant genomes and that genome-integrated NRVSs could help resolve unclear phylogenetic relationships among plant species.
Eukaryotic genomes contain sequences that have originated from DNA viruses and reverse-transcribing viruses, i.e., retroviruses, pararetroviruses (DNA viruses), and transposons. However, the sequences of non-retroviral RNA viruses, which are unable to convert their genomes to DNA, were until recently considered not to be integrated into eukaryotic nuclear genomes. We present evidence for multiple independent events of horizontal gene transfer from a wide range of RNA viruses, including plus-sense, minus-sense, and double-stranded RNA viruses, into the genomes of distantly related plant lineages. Some non-retroviral integrated RNA viral sequences are conserved across genera within a plant family, whereas others are retained only in a limited number of species in a genus. Integration profiles of non-retroviral integrated RNA viral sequences demonstrate the potential of these sequences to serve as powerful molecular tools for deciphering phylogenetic relationships among related plants. Moreover, this study highlights plants co-opting non-retroviral RNA virus sequences, and provides insights into plant genome evolution and interplay between non-reverse-transcribing RNA viruses and their hosts.