A database and API for variation, dense genotyping and resequencing data
Open Access
- 11 May 2010
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 11 (1) , 238
- https://doi.org/10.1186/1471-2105-11-238
Abstract
Background: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data, necessitating the redesign of existing genome-wide bioinformatics resources.
Results: Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments.
Conclusions: Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.