SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays
Open Access
- 24 October 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (1) , 57-63
- https://doi.org/10.1093/bioinformatics/btl536
Abstract
Motivation: The technology to genotype single nucleotide polymorphisms (SNPs) at extremely high densities enables hypothesis-free genome-wide scans for common polymorphisms associated with complex disease. However, we find that some errors introduced by commonly employed genotyping algorithms may inflate the number of false associations between markers and phenotype.

Results: We have developed a novel SNP genotype calling program, SNiPer-High Density (SNiPer-HD), for highly accurate genotype calling across hundreds of thousands of SNPs. The program employs an expectation-maximization (EM) algorithm with parameters based on a training sample set. This choice of algorithm allows highly accurate genotyping for most SNPs. We also introduce a quality control metric for each assayed SNP, so that poorly behaving SNPs can be filtered using a metric that correlates with genotype class separation in the calling algorithm. SNiPer-HD is superior to the standard dynamic modeling algorithm and is complementary and non-redundant to other algorithms, such as BRLMM. Implementing multiple algorithms together may provide highly accurate genotype calls, without inflation of false positives due to systematically miscalled SNPs. A reliable and accurate set of SNP genotypes for increasingly dense panels will eliminate some false association signals and false negative signals, allowing for rapid identification of disease susceptibility loci for complex traits.

Availability: SNiPer-HD is available at TGen's website.

Contact: dstephan@tgen.org
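The abstract does not give the details of the SNiPer-HD model, so the following is only a minimal sketch of the general idea it describes: EM-based genotype calling for a single SNP, started from training-derived values, followed by a per-SNP separation score. Everything specific here is an assumption made for illustration: the one-dimensional allele contrast feature, the three-component Gaussian mixture, the starting means and standard deviation, the pseudo-count that pulls estimates toward the starting values, the silhouette-style score used as a stand-in for the quality control metric, and the function names `em_genotype_calls` and `mean_silhouette`.

```python
import numpy as np


def em_genotype_calls(contrast, init_means=(-0.8, 0.0, 0.8), init_sd=0.15,
                      prior_strength=1.0, n_iter=50):
    """Fit a 3-component 1-D Gaussian mixture by EM and call genotype classes.

    `contrast` is an assumed per-sample allele contrast such as (A - B) / (A + B).
    `init_means` and `init_sd` are hypothetical training-derived starting values.
    Returns class indices (0, 1, 2 in order of increasing contrast at start),
    the posterior probabilities, and the fitted means.
    """
    x = np.asarray(contrast, dtype=float)
    prior_means = np.array(init_means, dtype=float)
    means = prior_means.copy()
    sds = np.full(3, float(init_sd))
    weights = np.full(3, 1.0 / 3.0)
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each sample.
        z = (x[:, None] - means) / sds
        dens = weights * np.exp(-0.5 * z ** 2) / (sds * np.sqrt(2.0 * np.pi))
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: weighted updates, with a small pseudo-count toward the
        # starting values (an assumption, to keep sparse classes stable).
        nk = resp.sum(axis=0)
        weights = (nk + prior_strength) / (nk + prior_strength).sum()
        means = ((resp * x[:, None]).sum(axis=0)
                 + prior_strength * prior_means) / (nk + prior_strength)
        sq = (resp * (x[:, None] - means) ** 2).sum(axis=0)
        var = (sq + prior_strength * init_sd ** 2) / (nk + prior_strength)
        sds = np.sqrt(np.maximum(var, 4e-4))  # floor the sd at 0.02
    calls = resp.argmax(axis=1)
    return calls, resp, means


def mean_silhouette(x, calls):
    """Mean silhouette width of the called classes (higher = better separated)."""
    x = np.asarray(x, dtype=float)
    calls = np.asarray(calls)
    labels = np.unique(calls)
    if labels.size < 2:
        return 0.0  # a single class has no separation to measure
    dist = np.abs(x[:, None] - x[None, :])
    scores = np.zeros(x.size)
    for i in range(x.size):
        own = calls == calls[i]
        if own.sum() < 2:
            continue  # singleton classes score 0 by convention
        a = dist[i, own].sum() / (own.sum() - 1)  # mean distance within class
        b = min(dist[i, calls == k].mean() for k in labels if k != calls[i])
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return float(scores.mean())
```

In this sketch, a SNP would be called with `calls, resp, means = em_genotype_calls(contrast)` and then scored with `mean_silhouette(contrast, calls)`. SNPs whose score falls below some chosen cutoff could be filtered as poorly separated. Any such cutoff would need to be tuned on real data and is not taken from the paper.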