MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features
Open Access
- 8 May 2007
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Web Server), W339-W344
- https://doi.org/10.1093/nar/gkm368
Abstract
To distinguish real pre-miRNAs from other hairpin sequences with similar stem-loops (pseudo pre-miRNAs), a hybrid feature is used that consists of local contiguous structure-sequence composition, the minimum free energy (MFE) of the secondary structure, and the P-value of a randomization test. In addition, a novel machine-learning algorithm, random forest (RF), is introduced. The results suggest that our method achieves 98.21% specificity and 95.09% sensitivity. Compared with the previous method, Triplet-SVM-classifier, our RF method was nearly 10% higher in total accuracy. Further analysis indicated that the improvement was due to both the combined features and the RF algorithm. The MiPred web server is available at http://www.bioinf.seu.edu.cn/miRNA/. Given a sequence, MiPred first decides whether it is a pre-miRNA-like hairpin sequence. If it is, the RF classifier then predicts whether it is a real pre-miRNA or a pseudo one.
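The hybrid feature described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the structure-sequence composition is read here as the 32 "triplet element" frequencies (the middle nucleotide combined with the paired/unpaired state of three adjacent positions), the randomization test uses simple mononucleotide shuffling, and the P-value estimator (R + 1) / (N + 1) is one common choice. The names `triplet_features`, `randomization_p_value`, and `mfe_fn` are hypothetical; `mfe_fn` stands for any folding routine supplied by the caller that returns an MFE for a sequence.

```python
import random
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"
STATES = "(."  # "(" = paired (either bracket direction), "." = unpaired


def triplet_features(seq, struct):
    """Return 32 normalized frequencies keyed by nucleotide + 3-position pairing pattern.

    Assumption: `struct` is a dot-bracket string of the same length as `seq`.
    """
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper().replace("T", "U")
    paired = struct.replace(")", "(")  # ignore bracket direction
    keys = [n + "".join(p) for n in NUCLEOTIDES for p in product(STATES, repeat=3)]
    counts = Counter()
    # Slide over interior positions so each one has a left and right neighbour.
    for i in range(1, len(seq) - 1):
        if seq[i] in NUCLEOTIDES:
            counts[seq[i] + paired[i - 1:i + 2]] += 1
    total = sum(counts.values())
    return {k: (counts[k] / total if total else 0.0) for k in keys}


def randomization_p_value(seq, mfe_fn, n_shuffles=1000, seed=0):
    """Estimate how often a shuffled sequence folds at least as stably as `seq`.

    Assumption: mononucleotide shuffling; `mfe_fn` is a caller-supplied
    function mapping a sequence to its minimum free energy.
    """
    rng = random.Random(seed)
    observed = mfe_fn(seq)
    hits = 0
    for _ in range(n_shuffles):
        chars = list(seq)
        rng.shuffle(chars)
        if mfe_fn("".join(chars)) <= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)
```

Under these assumptions, the 32 frequencies, the MFE, and the P-value would be concatenated into one feature vector per hairpin and passed to a random forest classifier trained on real and pseudo pre-miRNAs.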