New tools and methods for direct programmatic access to the dbSNP relational database
Open Access
- 30 October 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 39 (suppl_1), D901-D907
- https://doi.org/10.1093/nar/gkq1054
Abstract
Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.
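As a rough illustration of what a task-related query against a small set of custom tables might look like, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a local MySQL installation. The table names (`snp_position`, `snp_gene`), their columns, the sample rows, and the function names are hypothetical and are not taken from the dbSNP schema or from the authors' software; a real local dbSNP build would differ.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical task tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE snp_position (
            snp_id INTEGER PRIMARY KEY,  -- numeric part of an rs identifier
            chrom  TEXT,
            pos    INTEGER
        );
        CREATE TABLE snp_gene (
            snp_id      INTEGER,
            gene_symbol TEXT
        );
        """
    )
    # Made-up rows for illustration only; these are not real dbSNP records.
    conn.executemany(
        "INSERT INTO snp_position VALUES (?, ?, ?)",
        [(101, "1", 1000), (102, "1", 2500), (103, "2", 1500)],
    )
    conn.executemany(
        "INSERT INTO snp_gene VALUES (?, ?)",
        [(101, "GENE_A"), (102, "GENE_B"), (103, "GENE_C")],
    )
    conn.commit()
    return conn


def snps_in_region(conn, chrom, start, end):
    """Return (rs_id, position, gene_symbol) for SNPs in a chromosomal region."""
    rows = conn.execute(
        """
        SELECT p.snp_id, p.pos, g.gene_symbol
        FROM snp_position AS p
        LEFT JOIN snp_gene AS g ON g.snp_id = p.snp_id
        WHERE p.chrom = ? AND p.pos BETWEEN ? AND ?
        ORDER BY p.pos
        """,
        (chrom, start, end),
    ).fetchall()
    return [("rs%d" % snp_id, pos, gene) for snp_id, pos, gene in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    for row in snps_in_region(conn, "1", 500, 3000):
        print(row)
```

The query is parameterized, so the same function could serve any region of interest from an association study; against an actual MySQL installation the connection object and placeholder syntax would depend on the client library used.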