Selecting Differentially Expressed Genes from Microarray Experiments
- 24 March 2003
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 59 (1) , 133-142
- https://doi.org/10.1111/1541-0420.00016
Abstract
Summary. High throughput technologies, such as gene expression arrays and protein mass spectrometry, allow one to simultaneously evaluate thousands of potential biomarkers that could distinguish different tissue types. Of particular interest here is distinguishing between cancerous and normal organ tissues. We consider statistical methods to rank genes (or proteins) in regards to differential expression between tissues. Various statistical measures are considered, and we argue that two measures related to the Receiver Operating Characteristic Curve are particularly suitable for this purpose. We also propose that sampling variability in the gene rankings be quantified, and suggest using the “selection probability function,” the probability distribution of rankings for each gene. This is estimated via the bootstrap. A real dataset, derived from gene expression arrays of 23 normal and 30 ovarian cancer tissues, is analyzed. Simulation studies are also used to assess the relative performance of different statistical gene ranking measures and our quantification of sampling variability. Our approach leads naturally to a procedure for sample‐size calculations, appropriate for exploratory studies that seek to identify differentially expressed genes.
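The ranking-and-bootstrap idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which two ROC-related measures are used, so the empirical area under the ROC curve is assumed here as the ranking statistic. The function names (`empirical_auc`, `selection_probability`), the `top_k` cutoff, and the simulated data are all hypothetical, and "selection probability" is simplified to the bootstrap frequency with which a gene ranks among the top `top_k`.

```python
import numpy as np

def empirical_auc(cases, controls):
    """Empirical AUC per gene: P(case > control) + 0.5 * P(tie).

    cases: (n_cases, n_genes) array; controls: (n_controls, n_genes) array.
    """
    # Compare every case tissue with every control tissue, gene by gene.
    diff = cases[:, None, :] - controls[None, :, :]
    return (diff > 0).mean(axis=(0, 1)) + 0.5 * (diff == 0).mean(axis=(0, 1))

def selection_probability(cases, controls, top_k, n_boot=200, seed=0):
    """Bootstrap estimate of P(gene ranks in the top_k by AUC)."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(cases.shape[1])
    for _ in range(n_boot):
        # Resample tissues with replacement, separately within each group.
        b_cases = cases[rng.integers(0, len(cases), len(cases))]
        b_controls = controls[rng.integers(0, len(controls), len(controls))]
        auc = empirical_auc(b_cases, b_controls)
        # Rank genes by decreasing AUC and record the top_k of this resample.
        hits[np.argsort(-auc)[:top_k]] += 1
    return hits / n_boot

# Simulated toy data (assumption): 23 normal and 30 cancer tissues, 50 genes,
# with gene 0 made strongly overexpressed in the cancer group.
rng = np.random.default_rng(1)
controls = rng.normal(size=(23, 50))
cases = rng.normal(size=(30, 50))
cases[:, 0] += 3.0
probs = selection_probability(cases, controls, top_k=5)
```

In this toy setting, `probs[0]` should be close to 1, while genes with no true difference share the remaining top-5 slots with low individual probabilities. Resampling within each tissue group keeps the group sizes fixed across bootstrap replicates.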
This publication has 10 references indexed in Scilit:
- Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. Journal of the American Statistical Association, 2002
- Bootstrapping cluster analysis: Assessing the reliability of conclusions from microarray experiments. Proceedings of the National Academy of Sciences, 2001
- Phases of Biomarker Development for Early Detection of Cancer. JNCI Journal of the National Cancer Institute, 2001
- Immunogenicity of recombinant GA733-2E antigen (CO17-1A, EGP, KS1-4, KSA, Ep-CAM) in gastro-intestinal carcinoma patients. International Journal of Cancer, 2001
- Receiver Operating Characteristic Methodology. Journal of the American Statistical Association, 2000
- Conserved expression of hepatocyte growth factor activator inhibitor type-2/placental bikunin in human colorectal carcinomas. Cancer Letters, 2000
- Violin Plots: A Box Plot-Density Trace Synergism. The American Statistician, 1998
- An Introduction to the Bootstrap. Published by Springer Nature, 1993
- Analyzing a Portion of the ROC Curve. Medical Decision Making, 1989
- The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology, 1975