Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information
- 1 September 1996
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 24 (17), 3439-3452
- https://doi.org/10.1093/nar/24.17.3439
Abstract
Artificial neural networks have been combined with a rule-based system to predict intron splice sites in the dicot plant Arabidopsis thaliana. A two-step prediction scheme, where a global prediction of the coding potential regulates a cutoff level for a local prediction of splice sites, is refined by rules based on splice site confidence values, prediction scores, coding context and distances between potential splice sites. In this approach, the splice site predictions affect each other in a non-local manner. The combined approach drastically reduces the large number of false positive splice sites that normally haunt splice site prediction. An analysis of the errors made by the networks in the first step of the method revealed a previously unknown feature, a frequent T-tract prolongation containing cryptic acceptor sites in the 5' end of exons. The method presented here has been compared with three other approaches, GeneFinder, GeneMark and Grail. Overall the method presented here is an order of magnitude better. We show that the new method is able to find a donor site in the coding sequence for the jellyfish Green Fluorescent Protein, exactly at the position that was experimentally observed in A.thaliana transformants. Predictions for alternatively spliced genes are also presented, together with examples of genes from other dicots, monocots and algae. The method has been made available through electronic mail (NetPlantGene@cbs.dtu.dk), or the WWW at http://www.cbs.dtu.dk/NetPlantGene.html
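The two-step idea in the abstract, a global coding-potential signal regulating the cutoff applied to local splice site scores, can be illustrated with a toy sketch. This is not the published method: the real system uses neural networks plus a rule-based refinement, whereas the function names, the linear cutoff formula, and all numeric values below (`base`, `gain`, `floor`, `ceiling`, `window`) are assumptions chosen only for illustration. The sketch treats a drop in coding potential from upstream to downstream as weak evidence of an exon-to-intron transition and lowers the donor cutoff there.

```python
from statistics import mean
from typing import List


def coding_drop(coding: List[float], i: int, window: int) -> float:
    """Mean coding potential upstream of position i minus the mean downstream.

    A large positive value suggests a coding-to-non-coding transition,
    which is where a donor site would be expected (illustrative assumption).
    """
    upstream = coding[max(0, i - window):i]
    downstream = coding[i:i + window]
    if not upstream or not downstream:
        return 0.0
    return mean(upstream) - mean(downstream)


def donor_cutoff(drop: float, base: float = 0.8, gain: float = 0.5,
                 floor: float = 0.2, ceiling: float = 0.95) -> float:
    """Hypothetical linear rule: the larger the coding drop, the lower the cutoff."""
    return min(ceiling, max(floor, base - gain * drop))


def predict_donors(local_scores: List[float], coding: List[float],
                   window: int = 3) -> List[int]:
    """Return positions whose local score reaches the position-specific cutoff."""
    if len(local_scores) != len(coding):
        raise ValueError("local_scores and coding must have the same length")
    return [
        i for i, score in enumerate(local_scores)
        if score >= donor_cutoff(coding_drop(coding, i, window))
    ]


# Made-up inputs: coding potential falls sharply at position 5.
coding = [0.9] * 5 + [0.1] * 5
local_scores = [0.1, 0.6, 0.1, 0.1, 0.1, 0.6, 0.1, 0.1, 0.6, 0.1]
donors = predict_donors(local_scores, coding)
```

With these made-up inputs, three positions share the same local score of 0.6, but only the one sitting at the coding-potential drop passes its lowered cutoff. The rule-based refinement described in the abstract (confidence values, coding context, distances between candidate sites) is not modeled here.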