Computerized polymorphic marker identification: Experimental validation and a predicted human polymorphism catalog
Open Access
- 23 June 1998
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 95 (13) , 7514-7519
- https://doi.org/10.1073/pnas.95.13.7514
Abstract
A computational system for the prediction of polymorphic loci directly and efficiently from human genomic sequence was developed and verified. A suite of programs, collectively called pompous (polymorphic marker prediction of ubiquitous simple sequences), detects tandem repeats ranging from dinucleotides up to 250-mers, scores them according to predicted level of polymorphism, and designs appropriate flanking primers for PCR amplification. This approach was validated on an approximately 750-kilobase region of human chromosome 3p21.3, involved in lung and breast carcinoma homozygous deletions. Target DNA from 36 paired B lymphoblastoid and lung cancer lines was amplified and allelotyped for 33 loci predicted by pompous to be variable in repeat size. We found that among those 36 predominantly Caucasian individuals, 22 of the 33 (67%) predicted loci were polymorphic, with an average heterozygosity of 0.42. Allele loss in this region was found in 27/36 (75%) of the tumor lines using these markers. pompous provides the genetic researcher with an additional tool for the rapid and efficient identification of polymorphic markers, and through a World Wide Web site, investigators can use pompous to identify polymorphic markers for their research. A catalog of 13,261 potential polymorphic markers and associated primer sets has been created from the analysis of 141,779,504 base pairs of human genomic sequence in GenBank. These data are available on our Web site (pompous.swmed.edu) and will be updated periodically as GenBank is expanded and algorithm accuracy is improved.
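The abstract does not describe how pompous detects repeats, so the following Python sketch is only an illustration of the general idea: it finds perfect tandem repeats with a regular-expression backreference and computes an expected heterozygosity from allele labels. The function names, the thresholds (`min_unit`, `max_unit`, `min_copies`), and the use of the formula 1 - sum(p_i^2) are assumptions made for this example and are not taken from the paper. Polymorphism scoring and primer design are not shown.

```python
import re
from collections import Counter


def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=4):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: thresholds are illustrative, not the paper's.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of unit_len bases, then at least min_copies - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter motif,
            # e.g. "CACA" is already reported as the dinucleotide "CA".
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits


def expected_heterozygosity(alleles):
    """Expected heterozygosity 1 - sum(p_i^2) from a list of allele labels.

    Assumption: the abstract does not say how heterozygosity was calculated.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


if __name__ == "__main__":
    # Made-up sequence: a (CA)10 repeat with short non-repetitive flanks.
    demo = "GGATTG" + "CA" * 10 + "TTGACG"
    print(find_tandem_repeats(demo))
    # Made-up allele labels for one locus.
    print(expected_heterozygosity(["a", "a", "b", "b"]))
```

A real marker-prediction tool would also need to tolerate imperfect repeats and to rank candidates by likely variability; this sketch deliberately covers neither.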