CART Classification of Human 5' UTR Sequences
Open Access
- 1 November 2000
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 10 (11) , 1807-1816
- https://doi.org/10.1101/gr.gr-1460r
Abstract
A nonredundant database of 2312 full-length human 5′-untranslated regions (UTRs) was carefully prepared using state-of-the-art experimental and computational technologies. A comprehensive computational analysis of this data was conducted for characterizing the 5′ UTR features. Classification and regression tree (CART) analysis was used to classify the data into three distinct classes. Class I consists of mRNAs that are believed to be poorly translated with long 5′ UTRs filled with potential inhibitory features. Class II consists of terminal oligopyrimidine tract (TOP) mRNAs that are regulated in a growth-dependent manner, and class III consists of mRNAs with favorable 5′ UTR features that may help efficient translation. The most accurate tree we found has 92.5% classification accuracy as estimated by cross validation. The classification model included the presence of TOP, a secondary structure, 5′ UTR length, and the presence of upstream AUGs (uAUGs) as the most relevant variables. The present classification and characterization of the 5′ UTRs provide precious information for better understanding the translational regulation of human mRNAs. Furthermore, this database and classification can help people build better computational models for predicting the 5′-terminal exon and separating the 5′ UTR from the coding region.
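The sequence-level variables named in the abstract (TOP presence, 5′ UTR length, uAUG count) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `utr_features` and `toy_classify`, the TOP pattern (a leading C followed by at least four pyrimidines), the 200-nt length cutoff, and the order of the splits. The secondary-structure variable is left out because it would require an RNA folding tool. The hand-written rules only stand in for a fitted tree; they are not the published CART model.

```python
import re


def utr_features(utr: str) -> dict:
    """Compute simple features from a 5' UTR written 5'->3' (DNA or RNA letters)."""
    seq = utr.upper().replace("T", "U")
    # Assumed TOP definition: C at position 1 followed by >= 4 pyrimidines (C/U).
    has_top = re.match(r"C[CU]{4,}", seq) is not None
    # Count upstream AUG triplets; the lookahead lets overlapping hits be counted.
    n_uaug = len(re.findall(r"(?=AUG)", seq))
    return {"length": len(seq), "has_top": has_top, "n_uaug": n_uaug}


def toy_classify(features: dict, long_utr: int = 200) -> str:
    """Hypothetical stand-in for a fitted tree; thresholds are illustrative only."""
    if features["has_top"]:
        return "II"   # TOP mRNAs
    if features["length"] > long_utr or features["n_uaug"] > 0:
        return "I"    # long UTR or uAUGs: potentially inhibitory features
    return "III"      # short UTR without uAUGs: favorable features
```

For example, `toy_classify(utr_features("CTTTTCCGGAGC"))` takes the TOP branch. In practice the splits and thresholds would be learned from labeled data by a tree-fitting procedure rather than written by hand.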