Efficient Calculation of Interval Scores for DNA Copy Number Data Analysis
- 1 March 2006
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 13 (2), 215-228
- https://doi.org/10.1089/cmb.2006.13.215
Abstract
DNA amplifications and deletions characterize the cancer genome and are often related to disease evolution. Microarray-based techniques for measuring these DNA copy-number changes use fluorescence ratios at arrayed DNA elements (BACs, cDNA, or oligonucleotides) to provide signals at high resolution in terms of genomic location. These data are then further analyzed to map aberrations and their boundaries and to identify biologically significant structures. We develop a statistical framework that enables the casting of several DNA copy number data analysis questions as optimization problems over real-valued vectors of signals. The simplest form of the optimization problem seeks to maximize $\varphi(I) = \sum_{i \in I} \nu(i) / \sqrt{|I|}$ over all subintervals $I$ in the input vector. We present and prove a linear time approximation scheme for this problem, namely, a process with time complexity $O(n\varepsilon^{-2})$ that outputs an interval for which $\varphi(I)$ is at least $\mathrm{Opt}/\alpha(\varepsilon)$, where $\mathrm{Opt}$ is the actual optimum and $\alpha(\varepsilon) \to 1$ as $\varepsilon \to 0$. We further develop practical implementations that improve the performance of the naive quadratic approach by orders of magnitude. We discuss properties of optimal intervals and how they apply to the algorithm performance. We benchmark our algorithms on synthetic as well as publicly available DNA copy number data. We demonstrate the use of these methods for identifying aberrations in single samples as well as common alterations in fixed sets and subsets of breast cancer samples.
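To make the objective concrete, the following is a minimal sketch of the naive quadratic baseline that the abstract mentions, not the authors' linear time approximation scheme. It scores every subinterval by the sum of its signals divided by the square root of its length, using prefix sums so that each interval is evaluated in constant time. The function name `max_interval_score` and the half-open `(start, end)` convention are assumptions made for illustration only.

```python
import math
from itertools import accumulate


def max_interval_score(signal):
    """Exhaustively maximize sum(signal[start:end]) / sqrt(end - start).

    Returns (best_score, start, end) with a half-open interval [start, end).
    This is the O(n^2) baseline, shown only to illustrate the objective.
    """
    if not signal:
        raise ValueError("signal must be non-empty")

    # prefix[k] holds the sum of the first k signal values.
    prefix = [0.0, *accumulate(signal)]
    n = len(signal)

    best_score, best_start, best_end = -math.inf, 0, 0
    for start in range(n):
        for end in range(start + 1, n + 1):
            interval_sum = prefix[end] - prefix[start]
            score = interval_sum / math.sqrt(end - start)
            if score > best_score:
                best_score, best_start, best_end = score, start, end
    return best_score, best_start, best_end


if __name__ == "__main__":
    # Hypothetical toy signal: a short run of elevated values inside noise.
    toy_signal = [1, -1, 2, 2, -1]
    print(max_interval_score(toy_signal))
```

On the toy input, the two adjacent values of 2 form the best interval: widening it to include neighboring values raises the length penalty faster than it raises the sum. The same normalization is what lets a long run of moderately elevated signals compete with a short run of strongly elevated ones.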