MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data
- 1 January 1995
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 23 (23), 4878-4884
- https://doi.org/10.1093/nar/23.23.4878
Abstract
The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (>200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
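The abstract describes position weighting by information content and a relative matrix similarity, but it does not give the formulas. The following is a minimal Python sketch of one way such a score could be computed, not the authors' implementation. It assumes a Shannon-style information content in bits over the four nucleotides and normalizes the weighted score of a candidate site by the best weighted score the matrix can achieve. The function names, the `TOY_MATRIX` counts, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"

# Hypothetical 4-position nucleotide count matrix (one dict per position).
# These counts are invented for illustration; they are not from the paper's library.
TOY_MATRIX = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 5, "C": 5, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]


def information_content(column):
    """Assumed weighting: Shannon information content of one position, in bits.

    Returns 2.0 for a fully conserved position and 0.0 for a uniform one.
    """
    total = sum(column.values())
    if total == 0:
        return 0.0
    ic = 2.0  # log2(4), the maximum for four nucleotides
    for base in BASES:
        p = column.get(base, 0) / total
        if p > 0:
            ic += p * math.log2(p)
    return ic


def matrix_similarity(matrix, site):
    """Weighted score of `site` relative to the best achievable score (0 to 1)."""
    if len(site) != len(matrix):
        raise ValueError("site length must equal matrix length")
    achieved = 0.0
    best = 0.0
    for column, base in zip(matrix, site.upper()):
        weight = information_content(column)
        achieved += weight * column.get(base, 0)  # unknown symbols score 0
        best += weight * max(column.values())
    return achieved / best if best else 0.0


def scan(matrix, sequence, threshold=0.8):
    """Slide the matrix along `sequence`; report (offset, site, similarity) hits."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        site = sequence[i:i + width]
        score = matrix_similarity(matrix, site)
        if score >= threshold:
            hits.append((i, site, score))
    return hits
```

With the toy matrix, `matrix_similarity(TOY_MATRIX, "AGAT")` reaches the maximum because every position carries its most frequent base, while a mismatch at the weakly conserved third position costs less than a mismatch at a fully conserved one. This is the sense in which a matrix keeps information that a single IUPAC string discards.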