A computational approach to identify genes for functional RNAs in genomic sequences
- 1 October 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 29 (19) , 3928-3938
- https://doi.org/10.1093/nar/29.19.3928
Abstract
Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80-90% accurate in jackknife testing experiments for bacteria and 90-99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web.
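The abstract mentions classifiers built on nucleotide composition and evaluated by jackknife testing. The sketch below shows one way such a pipeline could be wired together; it is not the authors' implementation. The feature set (mono- and dinucleotide frequencies), the nearest-centroid classifier standing in for the neural networks and support vector machines, all function names, and the toy sequences are assumptions made for illustration only.

```python
from itertools import product

ALPHABET = "ACGU"
DINUCS = ["".join(p) for p in product(ALPHABET, repeat=2)]  # 16 dinucleotides


def composition(seq):
    """Return a 20-element vector: 4 mononucleotide + 16 dinucleotide frequencies."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabet
    if not seq:
        raise ValueError("empty sequence")
    mono = [seq.count(b) / len(seq) for b in ALPHABET]
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    di = [pairs.count(d) / max(len(pairs), 1) for d in DINUCS]
    return mono + di


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train, query):
    """Nearest-centroid stand-in classifier (assumption, not the paper's model).

    train is a list of (feature_vector, label) pairs.
    """
    labels = sorted({lab for _, lab in train})
    cents = {lab: centroid([v for v, l in train if l == lab]) for lab in labels}
    return min(labels, key=lambda lab: sq_dist(cents[lab], query))


def jackknife_accuracy(examples):
    """Leave-one-out test: hold out each example, train on the rest, score it."""
    data = [(composition(s), lab) for s, lab in examples]
    hits = 0
    for i, (vec, lab) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += predict(rest, vec) == lab
    return hits / len(data)


# Made-up toy sequences (hypothetical, not from the paper): GC-rich vs AU-rich.
TOY = [
    ("GGCGCGCCGGGCGCC", "rna"),
    ("GCGGGCCCGCGGCGC", "rna"),
    ("CCGGCGGGCGCCGGC", "rna"),
    ("AAUUAUAUAAUUAAU", "other"),
    ("UAUAAAUUUAUAUAA", "other"),
    ("AUAUUUAAAUAUUAU", "other"),
]

if __name__ == "__main__":
    print(jackknife_accuracy(TOY))
```

The jackknife loop is the part that carries over regardless of classifier: each sequence is scored by a model that never saw it, so the reported accuracy is not inflated by memorised training examples.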