Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library
- 1 January 2010
- journal article
- research article
- Published by Springer Nature in Nature Biotechnology
- Vol. 28 (1) , 47-55
- https://doi.org/10.1038/nbt.1600
Abstract
Sequencing a person's genome may reveal large DNA insertions and other structural rearrangements, but assessing their effects requires pinpointing them to nucleotide resolution. Lam et al. use a library of previously discovered rearrangements to map and analyze genetic variation.

Structural variants (SVs) are a major source of human genomic variation; however, characterizing them at nucleotide resolution remains challenging. Here we assemble a library of breakpoints at nucleotide resolution from collating and standardizing ~2,000 published SVs. For each breakpoint, we infer its ancestral state (through comparison to primate genomes) and its mechanism of formation (e.g., nonallelic homologous recombination, NAHR). We characterize breakpoint sequences with respect to genomic landmarks, chromosomal location, sequence motifs and physical properties, finding that the occurrence of insertions and deletions is more balanced than previously reported and that NAHR-formed breakpoints are associated with relatively rigid, stable DNA helices. Finally, we demonstrate an approach, BreakSeq, for scanning the reads from short-read sequenced genomes against our breakpoint library to accurately identify previously overlooked SVs, which we then validate by PCR. As new data become available, we expect our BreakSeq approach will become more sensitive and facilitate rapid SV genotyping of personal genomes.
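The idea of scanning reads against a breakpoint library can be illustrated with a minimal sketch. This is not the BreakSeq implementation: the abstract does not specify how reads are matched, so the sketch assumes exact substring matching and a simple deletion, and every function name and parameter (`deletion_junction`, `read_spans_junction`, `count_supporting_reads`, `flank`, `min_overlap`) is hypothetical.

```python
def deletion_junction(reference, del_start, del_end, flank):
    """Build the sequence that spans a deletion breakpoint.

    The deleted interval is reference[del_start:del_end]. The junction joins
    up to `flank` bases on either side of it. Returns the junction sequence
    and the offset of the breakpoint within that sequence.
    """
    left = reference[max(0, del_start - flank):del_start]
    right = reference[del_end:del_end + flank]
    return left + right, len(left)


def read_spans_junction(read, junction, offset, min_overlap):
    """Return True if the read matches the junction exactly (an assumption;
    a real pipeline would use an aligner that tolerates mismatches) and
    covers at least `min_overlap` bases on each side of the breakpoint."""
    pos = junction.find(read)
    while pos != -1:
        covers_left = pos + min_overlap <= offset
        covers_right = pos + len(read) >= offset + min_overlap
        if covers_left and covers_right:
            return True
        pos = junction.find(read, pos + 1)
    return False


def count_supporting_reads(reads, junction, offset, min_overlap=3):
    """Count the reads that span the breakpoint junction."""
    return sum(read_spans_junction(r, junction, offset, min_overlap) for r in reads)


# Toy example: the allele with the deletion lacks the CCCC block.
reference = "AAAACCCCGGGGTTTT"
junction, offset = deletion_junction(reference, del_start=4, del_end=8, flank=4)
reads = ["AAAGGG", "AAAACC", "AAAAG"]
support = count_supporting_reads(reads, junction, offset)
```

Requiring a minimum overlap on both sides of the breakpoint keeps a read that matches only one flank, and is therefore equally consistent with the reference allele, from being counted as evidence for the variant.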