Modeling splice sites with Bayes networks
Open Access
- 1 February 2000
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 16 (2), 152-158
- https://doi.org/10.1093/bioinformatics/16.2.152
Abstract
Motivation: The main goal in this paper is to develop accurate probabilistic models for important functional regions in DNA sequences (e.g. splice junctions that signal the beginning and end of transcription in human DNA). These methods can subsequently be utilized to improve the performance of gene-finding systems. The models built here attempt to model long-distance dependencies between non-adjacent bases.

Results: An efficient modeling method is described which models biological data more accurately than a first-order Markov model without increasing the number of parameters. Intuitively, a small number of parameters helps a learning system to avoid overfitting. Several experiments with the model are presented, which show a small improvement in the average accuracy as compared with a simple Markov model. These experiments suggest that single long-distance dependencies do not help the recognition problem, thus confirming several previous studies which have used more heuristic modeling techniques.

Availability: This software is available for download and as a web resource at http://www.ai.uic.edu/software

Contact: kasif@eecs.uic.edu
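
The parameter-count claim in the Results can be illustrated with a small sketch. The abstract does not give the network structure, so this assumes each sequence position has at most one parent position. Under that assumption, a position-specific first-order Markov model is the special case in which the parent is always the preceding base, and pointing a parent at a non-adjacent base leaves the number of free parameters unchanged. The function names (`fit`, `log_prob`, `n_free_parameters`), the pseudocount, and the toy sequences are hypothetical and are not taken from the paper or its software.

```python
import math

ALPHABET = "ACGT"

def fit(seqs, parent, pseudocount=1.0):
    """Estimate P(x_i | x_parent[i]) from aligned, equal-length sequences.

    parent[i] is the index of the position x_i depends on, or None for a root.
    """
    tables = []
    for i, p in enumerate(parent):
        contexts = [""] if p is None else list(ALPHABET)
        counts = {c: {b: pseudocount for b in ALPHABET} for c in contexts}
        for s in seqs:
            c = "" if p is None else s[p]
            counts[c][s[i]] += 1
        # Normalize each row of counts into a conditional distribution.
        tables.append({
            c: {b: n / sum(row.values()) for b, n in row.items()}
            for c, row in counts.items()
        })
    return tables

def log_prob(seq, parent, tables):
    """Log-probability of seq under the single-parent model."""
    total = 0.0
    for i, p in enumerate(parent):
        c = "" if p is None else seq[p]
        total += math.log(tables[i][c][seq[i]])
    return total

def n_free_parameters(parent):
    """3 free values for a root, 4 contexts x 3 free values otherwise."""
    return sum(3 if p is None else 4 * 3 for p in parent)

# Made-up aligned windows, for illustration only.
toy = ["AGGT", "CGGT", "AGGA", "TGGT"]

chain = [None, 0, 1, 2]  # first-order Markov: each base depends on its predecessor
tree = [None, 0, 0, 2]   # position 2 depends on non-adjacent position 0

chain_tables = fit(toy, chain)
tree_tables = fit(toy, tree)
```

Both `chain` and `tree` yield the same value from `n_free_parameters`, which is the sense in which a dependency on a non-adjacent base need not cost extra parameters. How the parent of each position would be chosen is not addressed by this sketch.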