Shotgun sequence assembly and recent segmental duplications within the human genome
- 21 October 2004
- journal article
- research article
- Published by Springer Nature in Nature
- Vol. 431 (7011), 927-930
- https://doi.org/10.1038/nature03062
Abstract
Complex eukaryotic genomes are now being sequenced at an accelerated pace primarily using whole-genome shotgun (WGS) sequence assembly approaches. WGS assembly was initially criticized because of its perceived inability to resolve repeat structures within genomes. Here, we quantify the effect of WGS sequence assembly on large, highly similar repeats by comparison of the segmental duplication content of two different human genome assemblies. Our analysis shows that large (> 15 kilobases) and highly identical (> 97%) duplications are not adequately resolved by WGS assembly. This leads to significant reduction in genome length and the loss of genes embedded within duplications. Comparable analyses of mouse genome assemblies confirm that strict WGS sequence assembly will oversimplify our understanding of mammalian genome structure and evolution; a hybrid strategy using a targeted clone-by-clone approach to resolve duplications is proposed.
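The size and identity thresholds quoted in the abstract can be read as a simple filter over a list of duplication records. The following is a minimal sketch of that reading only, not the authors' pipeline: the `Duplication` record, its field names, the helper functions, and the example values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duplication:
    """Hypothetical record for one pairwise duplication (illustrative only)."""
    name: str
    length_bp: int    # aligned length in base pairs
    identity: float   # sequence identity as a fraction between 0 and 1


# Thresholds taken from the abstract: > 15 kilobases and > 97% identity.
MIN_LENGTH_BP = 15_000
MIN_IDENTITY = 0.97


def exceeds_thresholds(dup: Duplication) -> bool:
    """Return True when a duplication is both larger and more identical
    than the thresholds stated in the abstract (strict inequalities)."""
    return dup.length_bp > MIN_LENGTH_BP and dup.identity > MIN_IDENTITY


def select_at_risk(dups: list[Duplication]) -> list[Duplication]:
    """Keep only the duplications in the class the abstract reports as
    not adequately resolved by WGS assembly."""
    return [d for d in dups if exceeds_thresholds(d)]


# Made-up example records, not data from the paper.
examples = [
    Duplication("dup_a", 20_000, 0.985),  # large and highly identical
    Duplication("dup_b", 8_000, 0.990),   # too short
    Duplication("dup_c", 40_000, 0.930),  # too divergent
    Duplication("dup_d", 15_000, 0.980),  # exactly 15 kb, not strictly greater
]

at_risk_names = [d.name for d in select_at_risk(examples)]
```

Both comparisons are strict, matching the "greater than" wording in the abstract, so a record of exactly 15 kilobases is excluded.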