Search algorithm for pattern match analysis of nucleic acid sequences
- 11 May 1983
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 11 (9) , 2943-2957
- https://doi.org/10.1093/nar/11.9.2943
Abstract
A new type of search algorithm to find biological information inherited in nucleic acid sequences was developed. The algorithm is of the pattern match type and is based on the fact that genetic information is often a function of a predictable statistical occurrence of the four bases within parts of the sequence. The search algorithm compares the known statistical pattern of bases in, e.g., a promoter with an unknown sequence and calculates the statistical significance of the match at all positions in the unknown sequence. The program was tested on 54 published prokaryotic promoters. 44 or 49 could be found with 1 or 4 false answers, respectively. The program was also used on plasmid pBR322. All promoters functioning in an in vitro transcription system were found (tet, anti-tet, p4, bla and ori) except the so-called p5 promoter. A search for donor and acceptor sites was performed in a human HLA genomic sequence that contains six introns. Five of the possible six donor and acceptor sites were found.
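The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: build a per-position base-frequency pattern from known sites, then score every window of an unknown sequence against it. It assumes a log-odds position weight matrix with a uniform background and a pseudocount, which stands in for the paper's own significance calculation and is not taken from it. The function names (`build_log_odds`, `scan`, `best_hits`), the toy sites, the test sequence, and the threshold are all hypothetical.

```python
import math

BASES = "ACGT"


def build_log_odds(aligned_sites, background=0.25, pseudocount=1.0):
    """Build per-position log2-odds scores from equal-length example sites.

    Assumption: uniform background and an additive pseudocount; the
    original paper may weight positions differently.
    """
    length = len(aligned_sites[0])
    matrix = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in aligned_sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2(counts[base] / total / background) for base in BASES}
        )
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence (uppercase A/C/G/T only)."""
    width = len(matrix)
    return [
        (start, sum(matrix[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


def best_hits(sequence, matrix, threshold):
    """Keep only the windows whose score reaches the threshold."""
    return [(pos, score) for pos, score in scan(sequence, matrix) if score >= threshold]


# Hypothetical toy data: four made-up hexamers resembling a -10 box.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
matrix = build_log_odds(sites)
sequence = "GGCGTATAATGCCG"
hits = best_hits(sequence, matrix, threshold=5.0)
print(hits)
```

In this sketch the threshold trades missed sites against false answers, which is the same trade-off the abstract reports as 44 promoters found with 1 false answer versus 49 found with 4.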