CONREAL: Conserved Regulatory Elements Anchored Alignment Algorithm for Identification of Transcription Factor Binding Sites by Phylogenetic Footprinting
Open Access
- 12 December 2003
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (1) , 170-178
- https://doi.org/10.1101/gr.1642804
Abstract
Prediction of transcription-factor target sites in promoters remains difficult due to the short length and degeneracy of the target sequences. Although the use of orthologous sequences and phylogenetic footprinting approaches may help in the recognition of conserved and potentially functional sequences, correct alignment of the short transcription-factor binding sites can be problematic for established algorithms, especially when aligning more divergent species. Here, we report a novel phylogenetic footprinting approach, CONREAL, that uses biologically relevant information, that is, potential transcription-factor binding sites as represented by positional weight matrices, to establish anchors between orthologous sequences and to guide promoter sequence alignment. Comparison of the performance of CONREAL with the global alignment programs LAGAN and AVID, using a reference data set, shows that CONREAL performs equally well for closely related species like rodents and human, and has a clear added value for aligning promoter elements of more divergent species like human and fish, as it identifies conserved transcription-factor binding sites that are not found by other methods. CONREAL is accessible via a Web interface at http://conreal.niob.knaw.nl/.
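The anchoring idea described in the abstract can be illustrated with a minimal sketch. It is not the published CONREAL implementation: the toy matrix `TOY_PWM`, the score threshold, the uniform background, and the helper names `pwm_score`, `scan`, and `collinear_anchors` are all assumptions made for illustration. The sketch scans two orthologous sequences with the same positional weight matrix, then keeps the longest set of hit pairs that appear in the same order in both sequences, which could then serve as anchors for an alignment.

```python
from math import log2

# Hypothetical toy positional weight matrix (per-position base probabilities).
# A real analysis would load matrices from a curated collection instead.
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]


def pwm_score(pwm, window, background=0.25):
    """Log-odds score of a window against a uniform background (assumed)."""
    return sum(log2(col[base] / background) for col, base in zip(pwm, window))


def scan(pwm, seq, threshold):
    """Return start positions whose window scores at or above the threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        # Skip windows containing ambiguous bases such as N.
        if set(window) <= set("ACGT") and pwm_score(pwm, window) >= threshold:
            hits.append(i)
    return hits


def collinear_anchors(hits_a, hits_b):
    """Longest chain of (i, j) hit pairs increasing in both sequences."""
    pairs = sorted((i, j) for i in hits_a for j in hits_b)
    if not pairs:
        return []
    best = [1] * len(pairs)   # best chain length ending at each pair
    prev = [-1] * len(pairs)  # back-pointer to the previous pair in the chain
    for k, (i, j) in enumerate(pairs):
        for m in range(k):
            pi, pj = pairs[m]
            if pi < i and pj < j and best[m] + 1 > best[k]:
                best[k] = best[m] + 1
                prev[k] = m
    k = max(range(len(pairs)), key=best.__getitem__)
    chain = []
    while k != -1:
        chain.append(pairs[k])
        k = prev[k]
    return chain[::-1]


seq_a = "GGTATACCCCTATAGG"  # made-up example sequences
seq_b = "TATAGGGGGGGTATA"
anchors = collinear_anchors(scan(TOY_PWM, seq_a, 6.0), scan(TOY_PWM, seq_b, 6.0))
print(anchors)  # [(2, 0), (10, 11)]
```

The quadratic chaining step is chosen for readability; with many matrices and long promoters, a more efficient chaining method and a score-aware choice among competing anchors would likely be needed.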