Modeling Read Counts for CNV Detection in Exome Sequencing Data
- 8 January 2011
- journal article
- research article
- Published by Walter de Gruyter GmbH in Statistical Applications in Genetics and Molecular Biology
- Vol. 10 (1)
- https://doi.org/10.2202/1544-6115.1732
Abstract
Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
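To make the general idea in the abstract concrete, the following is a minimal sketch of a hidden Markov model over copy-number states, decoded with the Viterbi algorithm. It is not the exomeCopy implementation. It rests on several assumptions that the abstract does not state: emissions are modeled as Poisson with a mean equal to the control-set background depth scaled by copy number, transitions use a single hypothetical `stay` probability, and positional covariates such as GC-content are left out. The names `viterbi_copy_number`, `stay`, and `eps` are illustrative only.

```python
import math


def poisson_logpmf(k, mu):
    """Log probability of observing count k under a Poisson with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def viterbi_copy_number(counts, background, states=(0, 1, 2, 3, 4),
                        normal=2, stay=0.99, eps=0.05):
    """Most likely copy-number path for one sample (illustrative sketch).

    counts:     observed read counts per target region
    background: expected counts per region at normal copy number
                (assumed to come from a control set)
    stay, eps:  hypothetical tuning values, not taken from the paper
    """
    n = len(states)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]

    def emit(t, s):
        # Expected depth scales with copy number; eps keeps the mean
        # positive for the zero-copy (homozygous deletion) state.
        mu = background[t] * max(states[s] / normal, eps)
        return poisson_logpmf(counts[t], mu)

    # Uniform initial distribution over states.
    score = [math.log(1.0 / n) + emit(0, s) for s in range(n)]
    backpointers = []
    for t in range(1, len(counts)):
        new_score, ptr = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best)
            new_score.append(score[best] + log_trans[best][j] + emit(t, j))
        score = new_score
        backpointers.append(ptr)

    # Trace back from the best final state.
    s = max(range(n), key=lambda i: score[i])
    path = [s]
    for ptr in reversed(backpointers):
        s = ptr[s]
        path.append(s)
    path.reverse()
    return [states[s] for s in path]


# Toy input: 30 regions with a simulated heterozygous deletion in the middle.
background = [100] * 30
counts = [100] * 10 + [50] * 10 + [100] * 10
calls = viterbi_copy_number(counts, background)
```

In this toy input the middle ten regions have half the background depth, which is the signature a heterozygous deletion would leave. Real exome data is noisier than a Poisson model allows, so a practical model would need an overdispersed emission distribution and fitted parameters.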