Integrating genome assemblies with MAIA
Open Access
- 4 September 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 26 (18) , i433-i439
- https://doi.org/10.1093/bioinformatics/btq366
Abstract
Motivation: De novo assembly of a eukaryotic genome with next-generation sequencing data is still a challenging task. Over the past few years several assemblers have been developed, often suitable for one specific type of sequencing data. The number of known genomes is expanding rapidly, so it is becoming possible to use multiple reference genomes for assembly projects. We introduce an assembly integrator that makes use of all available data, i.e. multiple de novo assemblies and mappings against multiple related genomes, by optimizing a weighted combination of criteria.
Results: The developed algorithm was applied to the de novo sequencing of the Saccharomyces cerevisiae CEN.PK 113-7D strain. Using Solexa and 454 read data, two de novo and three comparative assemblies were constructed and subsequently integrated, yielding 29 contigs covering more than 12 Mbp, a drastic improvement compared with the single assemblies.
Availability: MAIA is available as a Matlab package and can be downloaded from http://bioinformatics.tudelft.nl
Contact: j.f.nijkamp@tudelft.nl