Sparse representation and Bayesian detection of genome copy number alterations from microarray data
Open Access
- 1 February 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (3) , 309-318
- https://doi.org/10.1093/bioinformatics/btm601
Abstract
Motivation: Genomic instability in cancer leads to abnormal genome copy number alterations (CNA) that are associated with the development and behavior of tumors. Advances in microarray technology have allowed for greater resolution in detection of DNA copy number changes (amplifications or deletions) across the genome. However, the increase in the number of measured signals and the accompanying noise from the array probes present a challenge in accurate and fast identification of the breakpoints that define CNA. This article proposes a novel detection technique that uses piecewise constant (PWC) vectors to represent genome copy number and sparse Bayesian learning (SBL) to detect CNA breakpoints.

Methods: First, a compact linear algebra representation for the genome copy number is developed from normalized probe intensities. Second, SBL is applied and optimized to infer locations where copy number changes occur. Third, a backward elimination (BE) procedure is used to rank the inferred breakpoints, and a cut-off point can be efficiently adjusted in this procedure to control the false discovery rate (FDR).

Results: The performance of our algorithm is evaluated using simulated and real genome datasets and compared to other existing techniques. Our approach achieves the highest accuracy and lowest FDR while improving computational speed by several orders of magnitude. The proposed algorithm has been developed into a free-standing software application (GADA, Genome Alteration Detection Algorithm).

Availability: http://biron.usc.edu/~piquereg/GADA

Contact: shahab@chla.usc.edu and rpique@ieee.org

Supplementary information: Supplementary data are available at Bioinformatics online.
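The Methods steps can be illustrated with a minimal Python sketch. It is not the GADA implementation and it omits the SBL step entirely: the candidate breakpoints are simply assumed to be given. The function names (`to_breakpoints`, `from_breakpoints`, `backward_eliminate`), the known noise level `sigma`, and the two-segment score used for ranking are all assumptions made for illustration. The sketch shows how a PWC profile reduces to a handful of (position, jump) pairs, and how a backward elimination loop might discard the weakest candidate breakpoint until every remaining one scores above a cut-off.

```python
from itertools import accumulate


def to_breakpoints(x):
    """Return (first value, {index: jump}) for a piecewise constant sequence."""
    jumps = {i: x[i] - x[i - 1] for i in range(1, len(x)) if x[i] != x[i - 1]}
    return x[0], jumps


def from_breakpoints(start, jumps, length):
    """Rebuild the piecewise constant sequence by cumulative summation."""
    steps = [start] + [jumps.get(i, 0.0) for i in range(1, length)]
    return list(accumulate(steps))


def mean(values):
    return sum(values) / len(values)


def backward_eliminate(y, breakpoints, sigma, threshold):
    """Drop the weakest breakpoint until every score reaches the threshold.

    Assumption: each breakpoint is scored by the difference between the
    means of its two adjacent segments, scaled by a known noise level sigma.
    """
    kept = sorted(breakpoints)
    while kept:
        edges = [0] + kept + [len(y)]
        scores = []
        for k in range(1, len(edges) - 1):
            left = y[edges[k - 1]:edges[k]]
            right = y[edges[k]:edges[k + 1]]
            se = sigma * (1 / len(left) + 1 / len(right)) ** 0.5
            scores.append(abs(mean(right) - mean(left)) / se)
        weakest = min(range(len(scores)), key=scores.__getitem__)
        if scores[weakest] >= threshold:
            break  # all remaining breakpoints pass the cut-off
        del kept[weakest]
    return kept
```

In this sketch, raising `threshold` keeps fewer breakpoints, which mirrors the adjustable cut-off the abstract describes for trading sensitivity against false discoveries.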