Effective ambiguity checking in biosequence analysis
Open Access
- 20 June 2005
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (1) , 153
- https://doi.org/10.1186/1471-2105-6-153
Abstract
Background: Ambiguity is a problem in biosequence analysis that arises in various analysis tasks solved via dynamic programming, and in particular, in the modeling of families of RNA secondary structures with stochastic context free grammars. Several types of analysis are invalidated by the presence of ambiguity. As this problem inherits undecidability (as we show here) from the namely problem for context free languages, there is no complete algorithmic solution to the problem of ambiguity checking. Results: We explain frequently observed sources of ambiguity, and show how to avoid them. We suggest four testing procedures that may help to detect ambiguity when present, including a just-in-time test that permits to work safely with a potentially ambiguous grammar. We introduce, for the special case of stochastic context free grammars and RNA structure modeling, an automated partial procedure for proving non-ambiguity. It is used to demonstrate non-ambiguity for several relevant grammars. Conclusion: Our mechanical proof procedure and our testing methods provide a powerful arsenal of methods to ensure non-ambiguity.Keywords
This publication has 7 references indexed in Scilit:
- Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure predictionBMC Bioinformatics, 2004
- A discipline of dynamic programming over sequence dataScience of Computer Programming, 2004
- Complete suboptimal folding of RNA and the stability of secondary structuresBiopolymers, 1999
- Optimal computer folding of large RNA sequences using thermodynamics and auxiliary informationNucleic Acids Research, 1981
- On the translation of languages from left to rightInformation and Control, 1965
- The Algebraic Theory of Context-Free LanguagesPublished by Elsevier ,1963
- Three models for the description of languageIEEE Transactions on Information Theory, 1956