Physical map-assisted whole-genome shotgun sequence assemblies
Open Access
- 1 June 2006
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 16 (6) , 768-775
- https://doi.org/10.1101/gr.5090606
Abstract
We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the layout phase of WGS assemblies. This process is facilitated by FASSI, a stand-alone application that calculates BAC end and BAC overlap length constraints from clone fingerprint map contigs created by the FPC package. FASSI is designed to work with the assembly tool PCAP, but its output can be formatted to work with other WGS assembly algorithms able to use length constraints for individual clones. The FASSI method is simple to implement, potentially cost-effective, and has resulted in the increase of scaffold contiguity for both the Drosophila melanogaster and Cryptococcus gattii genomes when compared to a control assembly without map-derived constraints. A 6.5-fold coverage draft DNA sequence of the Pan troglodytes (chimpanzee) genome was assembled using map-derived constraints and resulted in a 26.1% increase in scaffold contiguity.
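The idea of deriving a length constraint from two overlapping BAC neighbors can be illustrated with a minimal sketch. This is not FASSI's actual code or output format, which the abstract does not specify. The names `Clone` and `overlap_constraint`, the `tolerance` parameter, and the reading of "sharing 40% of their size" as "the overlap is at least 40% of the larger clone" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Clone:
    """A BAC clone with a size estimated from its fingerprint (hypothetical type)."""
    name: str
    size_bp: int


def overlap_constraint(
    a: Clone,
    b: Clone,
    overlap_bp: int,
    min_shared: float = 0.40,   # threshold taken from the abstract
    tolerance: float = 0.10,    # assumed slack; not stated in the abstract
) -> Optional[Tuple[int, int]]:
    """Return (min_bp, max_bp) for the distance spanned by two neighboring clones.

    Returns None when the clones do not share enough of their size.
    """
    # Assumed interpretation: the overlap must cover at least `min_shared`
    # of the larger clone, so both clones meet the threshold.
    if overlap_bp < min_shared * max(a.size_bp, b.size_bp):
        return None
    # Distance from the outer end of one clone to the outer end of the other.
    span = a.size_bp + b.size_bp - overlap_bp
    slack = round(tolerance * span)
    return (span - slack, span + slack)
```

Under these assumptions, two clones of 150 kb and 160 kb overlapping by 80 kb span 230 kb, and the resulting range could be handed to an assembler that accepts per-clone length constraints. A pair overlapping by only 50 kb would yield no constraint.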