Metrics for comparing regulatory sequences on the basis of pattern counts
Open Access
- 12 February 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 20 (3) , 399-406
- https://doi.org/10.1093/bioinformatics/btg425
Abstract
Motivation: Upstream sequences contain short motifs, which mediate transcriptional regulation by specifically binding different transcription factors. The presence of common motifs in the regulatory regions of two genes might be considered a clue to potential co-regulation. A pattern count-based (dis)similarity metric between sequences could thus be used to classify genes according to their putative regulatory properties.
Results: We present here several metrics which rely on probability theory and which aim at comparing sequences on the basis of pattern counts. We compare these metrics to several classical dissimilarity and similarity metrics, and illustrate their behaviour with a biological example.
Supplementary information: The data, results, and R routines used in this paper are freely available at http://rsat.ulb.ac.be/rsat/published_data/pattern_count_metrics_2003/
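As a rough illustration of what comparing sequences "on the basis of pattern counts" can mean, the sketch below counts overlapping k-letter words in two toy sequences and computes two classical measures (a Euclidean distance on counts and a Jaccard similarity on pattern presence). It is not the probability-based metrics presented in the paper, nor the authors' R routines; the function names, the word length k = 2, and the example sequences are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def count_patterns(seq, k):
    """Count overlapping occurrences of every k-letter word in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def euclidean_distance(counts_a, counts_b):
    """Classical dissimilarity: Euclidean distance between count vectors."""
    patterns = set(counts_a) | set(counts_b)
    # Counter returns 0 for patterns absent from one of the sequences.
    return sqrt(sum((counts_a[p] - counts_b[p]) ** 2 for p in patterns))


def jaccard_similarity(counts_a, counts_b):
    """Classical similarity: shared patterns over all observed patterns."""
    a, b = set(counts_a), set(counts_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy "upstream sequences" (not data from the paper).
upstream_a = "ACGTACGT"
upstream_b = "ACGTTTTT"

counts_a = count_patterns(upstream_a, 2)
counts_b = count_patterns(upstream_b, 2)

print(euclidean_distance(counts_a, counts_b))
print(jaccard_similarity(counts_a, counts_b))
```

Measures like these treat every pattern equally and ignore how likely a given count is by chance, which is one plausible reason to look at probability-based alternatives of the kind the abstract describes.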