Predicting Coding Potential from Genome Sequence: Application to Betaherpesviruses Infecting Rats and Mice
Open Access
- 15 June 2005
- journal article
- research article
- Published by American Society for Microbiology in Journal of Virology
- Vol. 79 (12), 7570-7596
- https://doi.org/10.1128/jvi.79.12.7570-7596.2005
Abstract
Prediction of protein-coding regions and other features of primary DNA sequence has greatly contributed to experimental biology. Significant challenges remain in genome annotation methods, including the identification of small or overlapping genes and the assessment of mRNA splicing or unconventional translation signals in expression. We have employed a combined analysis of compositional biases and conservation together with frame-specific G+C representation to reevaluate and annotate the genome sequences of mouse and rat cytomegaloviruses. Our analysis predicts that there are at least 34 protein-coding regions in these genomes that were not apparent in earlier annotation efforts. These include 17 single-exon genes, three new exons of previously identified genes, a newly identified four-exon gene for a lectin-like protein (in rat cytomegalovirus), and 10 probable frameshift extensions of previously annotated genes. This expanded set of candidate genes provides an additional basis for investigation in cytomegalovirus biology and pathogenesis.
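As a rough illustration of what "frame-specific G+C representation" measures, the sketch below computes the G+C fraction separately at codon positions 1, 2, and 3 for a chosen reading frame. It is not the authors' method or software: the function name `gc_by_codon_position` and its parameters are hypothetical, and the abstract does not specify windowing, scoring, or thresholds. The general idea, assuming a G+C-rich genome, is that a frame in which one codon position is markedly more G+C-rich than the others is a candidate coding frame.

```python
def gc_by_codon_position(seq, frame=0):
    """Return the G+C fraction at codon positions 1, 2 and 3.

    Hypothetical helper for illustration only. `frame` (0, 1 or 2) is the
    offset of the first codon from the start of `seq`.
    """
    seq = seq.upper()
    gc_counts = [0, 0, 0]   # G or C observed at each codon position
    totals = [0, 0, 0]      # unambiguous bases seen at each codon position
    for i in range(frame, len(seq)):
        pos = (i - frame) % 3
        base = seq[i]
        if base in "ACGT":  # skip ambiguity codes such as N
            totals[pos] += 1
            if base in "GC":
                gc_counts[pos] += 1
    # A position with no unambiguous bases is reported as 0.0
    return [g / t if t else 0.0 for g, t in zip(gc_counts, totals)]


if __name__ == "__main__":
    # Toy sequence; a real analysis would scan a genome in sliding windows
    # and compare all three frames on both strands.
    for frame in range(3):
        print(frame, gc_by_codon_position("ATGGCCGCG", frame))
```

In practice such per-position profiles would be compared across frames and combined with other evidence, such as the conservation analysis the abstract mentions, before calling a region coding.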