Using background knowledge to improve inductive learning of DNA sequences
- 17 December 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 351-357
- https://doi.org/10.1109/caia.1994.323654
Abstract
Successful inductive learning requires that training data be expressed in a form where underlying regularities can be recognized by the learning system. Unfortunately, many applications of inductive learning, especially in the domain of molecular biology, have assumed that data are provided in a form already suitable for learning, whether or not such an assumption is actually justified. This paper describes the use of background knowledge of molecular biology to re-express data into a form more appropriate for learning. Our results show dramatic improvements in classification accuracy for two very different classes of DNA sequences using traditional “off-the-shelf” decision-tree and neural-network inductive-learning methods.