Occurrence Probability of Structured Motifs in Random Sequences
- 1 December 2002
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 9 (6) , 761-773
- https://doi.org/10.1089/10665270260518254
Abstract
The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
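To make the definition of a structured motif concrete, the following is a minimal Python sketch of a simulation-style estimate of its occurrence probability. It is not the approximation proposed in the paper: it is a plain Monte Carlo estimate under an assumed model of independent, uniformly distributed letters. The function names (`hamming`, `occurs`, `estimate_probability`), the parameters, and the choice of alphabet are all hypothetical and introduced here only for illustration.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(seq, box1, box2, d_min, d_max, max_subs=0):
    """Return True if box1, then a gap of d_min..d_max letters, then box2
    appears in seq, allowing up to max_subs substitutions in each box."""
    n, k1, k2 = len(seq), len(box1), len(box2)
    for i in range(n - k1 + 1):
        if hamming(seq[i:i + k1], box1) > max_subs:
            continue
        for d in range(d_min, d_max + 1):
            j = i + k1 + d  # start of the second box
            if j + k2 > n:
                break
            if hamming(seq[j:j + k2], box2) <= max_subs:
                return True
    return False


def estimate_probability(box1, box2, d_min, d_max, seq_len,
                         max_subs=0, trials=2000, seed=0, alphabet="ACGT"):
    """Monte Carlo estimate of P(at least one occurrence) in a random
    sequence of length seq_len (assumption: i.i.d. uniform letters)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice(alphabet) for _ in range(seq_len))
        hits += occurs(seq, box1, box2, d_min, d_max, max_subs)
    return hits / trials
```

As a usage example, `estimate_probability("TTGACA", "TATAAT", 16, 18, seq_len=200, max_subs=1)` estimates how often two six-letter boxes separated by 16 to 18 letters, each with at most one substitution, appear by chance in a 200-letter sequence. The box strings and distances here are illustrative inputs, not values taken from the paper. An estimate of this kind is the sort of quantity against which an analytic approximation could be checked.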
This publication has 7 references indexed in Scilit:
- Exact Distribution of the Distances between Any Occurrences of a Set of Words. Annals of the Institute of Statistical Mathematics, 2001
- Algorithms for Extracting Structured Motifs Using a Suffix Tree with an Application to Promoter and Regulatory Site Consensus Identification. Journal of Computational Biology, 2000
- Discovering regulatory elements in non-coding sequences by analysis of spaced dyads. Nucleic Acids Research, 2000
- Exact distribution of word occurrences in a random sequence of letters. Journal of Applied Probability, 1999
- Non-canonical sequence elements in the promoter structure. Cluster analysis of promoters recognized by Escherichia coli RNA polymerase. Nucleic Acids Research, 1997
- Compilation and analysis of Bacillus subtilis σA-dependent promoter sequences: evidence for extended contact between RNA polymerase and upstream promoter DNA. Nucleic Acids Research, 1995
- How many random digits are required until given sequences are obtained? Journal of Applied Probability, 1982