Exploiting noise in array CGH data to improve detection of DNA copy number change
Open Access
- 30 January 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (5) , e35
- https://doi.org/10.1093/nar/gkl730
Abstract
Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for diagnosing the pathogenesis of cancer and other diseases. Current analysis methods, being largely based on smoothing and/or segmentation, cannot detect both the aberration regions and the boundary break points very accurately. Furthermore, when evaluating the accuracy of an algorithm for analyzing array-CGH data, it is commonly assumed that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH data is indeed Gaussian and, if not, whether one can exploit the characteristics of the noise to develop novel analysis methods that accurately detect the aberration regions and the boundary break points simultaneously. By analyzing bacterial artificial chromosome (BAC) arrays with an average resolution of 1 Mb, 19 k oligo arrays with an average probe spacing of <100 kb and 385 k oligo arrays with an average probe spacing of about 6 kb, we show that when there are aberrations, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that such noise leads to worse performance of existing methods for detecting aberrations in array-CGH data than in the Gaussian noise case. We further develop a novel method, which optimally exploits the character of the noise and is capable of identifying both the aberration regions and the boundary break points very accurately. Finally, we propose a new concept, posteriori signal-to-noise ratio (p-SNR), to assign a confidence level to each detected aberration region and its boundaries.
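The two noise properties named in the abstract, departure from Gaussianity and long-range spatial correlation, can each be probed with a simple statistic. The following is a minimal sketch, not the authors' procedure: it assumes excess kurtosis as the Gaussianity check and detrended fluctuation analysis (DFA) as the correlation estimator, both chosen here for illustration only. The function names `excess_kurtosis` and `dfa_exponent`, the window sizes, and the synthetic input are all hypothetical.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis; values near 0 are consistent with Gaussian noise."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


def dfa_exponent(x, scales=(8, 16, 32, 64, 128)):
    """Scaling exponent from first-order DFA.

    Roughly 0.5 for uncorrelated noise; larger values suggest
    long-range correlation along the probe order.
    """
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated, mean-removed series
    fluctuations = []
    for n in scales:
        n_windows = len(profile) // n
        # One column per non-overlapping window of length n.
        windows = profile[: n_windows * n].reshape(n_windows, n).T
        t = np.arange(n)
        # Fit a straight line to every window at once (coeffs has shape (2, n_windows)).
        coeffs = np.polyfit(t, windows, 1)
        trend = np.outer(t, coeffs[0]) + coeffs[1]
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    # Slope of log F(n) against log n is the scaling exponent.
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return float(slope)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for log-ratio residuals ordered by genomic position.
    residuals = rng.standard_normal(4096)
    print("excess kurtosis:", excess_kurtosis(residuals))
    print("DFA exponent:", dfa_exponent(residuals))
```

On real data the input would be the log-ratios of one array ordered by genomic position, with any fitted copy-number signal subtracted first; how that subtraction is done affects both statistics, so the result should be read as indicative rather than conclusive.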