Predicting functional regulatory polymorphisms
Open Access
- 18 June 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (16) , 1787-1792
- https://doi.org/10.1093/bioinformatics/btn311
Abstract
Motivation: Limited availability of data has hindered the development of algorithms that can identify functionally meaningful regulatory single nucleotide polymorphisms (rSNPs). Given the large number of common polymorphisms known to reside in the human genome, the identification of functional rSNPs via laboratory assays will be costly and time-consuming. Therefore appropriate bioinformatics strategies for predicting functional rSNPs are necessary. Recent data from the Encyclopedia of DNA Elements (ENCODE) Project has significantly expanded the amount of available functional information relevant to non-coding regions of the genome, and, importantly, led to the conclusion that many functional elements in the human genome are not conserved.
Results: In this article we describe how ENCODE data can be leveraged to probabilistically determine the functional and phenotypic significance of non-coding SNPs (ncSNPs). The method achieves excellent sensitivity (∼80%) and specificity (∼99%) based on a set of known phenotypically relevant and non-functional SNPs. In addition, we show that our method is not overtrained through the use of cross-validation analyses.
Availability: The software platforms used in our analyses are freely available (http://www.cs.waikato.ac.nz/ml/weka/). In addition, we provide the training dataset (Supplementary Table 3), and our predictions (Supplementary Table 6), in the Supplementary Material.
Contact: nschork@scripps.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
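As a reading aid for the sensitivity and specificity figures quoted in the abstract, the following is a minimal sketch of how the two quantities are computed from binary predictions. It is not the authors' pipeline (which used Weka). The function name `sensitivity_specificity` and the toy labels are hypothetical, chosen only so that the numbers come out at the same 80% and 99% levels.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary data.

    True marks a functional SNP; False marks a non-functional one.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # functional, predicted functional
    fn = sum(1 for y, p in pairs if y and not p)      # functional, missed
    tn = sum(1 for y, p in pairs if not y and not p)  # non-functional, correctly rejected
    fp = sum(1 for y, p in pairs if not y and p)      # non-functional, wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 5 functional SNPs and 100 non-functional SNPs.
labels = [True] * 5 + [False] * 100
# 4 of 5 functional SNPs are detected; 1 of 100 non-functional SNPs is flagged.
predictions = [True] * 4 + [False] * 1 + [False] * 99 + [True] * 1

sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In a cross-validation setting, the same calculation would be applied to predictions made on held-out folds only, so that the reported figures reflect SNPs the classifier did not see during training.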