Matching in Studies of Classification Accuracy: Implications for Analysis, Efficiency, and Assessment of Incremental Value
- 26 February 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 64 (1) , 1-9
- https://doi.org/10.1111/j.1541-0420.2007.00823.x
Abstract
Summary. In case–control studies evaluating the classification accuracy of a marker, controls are often matched to cases with respect to factors associated with the marker and disease status. In contrast with matching in epidemiologic etiology studies, matching in the classification setting has not been rigorously studied. In this article, we consider the implications of matching in terms of the choice of statistical analysis, efficiency, and assessment of the incremental value of the marker over the matching covariates. We find that adjustment for the matching covariates is essential, as unadjusted summaries of classification accuracy can be biased. In many settings, matching is the most efficient covariate-dependent sampling scheme, and we provide an expression for the optimal matching ratio. However, we also show that matching greatly complicates estimation of the incremental value of the marker. We recommend that matching be carefully considered in the context of these findings.
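The claim that unadjusted summaries can be biased under matching can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes a hypothetical binormal model with one binary matching covariate Z, a marker distributed as N(b*Z + mu*D, 1) for disease status D, and exact matching, so that P(Z = 1) = p among both cases and controls. The function names and the parameter values mu, b, and p are illustrative assumptions.

```python
from math import erf, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def auc_binormal(delta):
    """AUC when case and control markers are unit-variance normals
    whose means differ by delta (cases minus controls)."""
    return norm_cdf(delta / sqrt(2.0))


def matched_aucs(mu, b, p):
    """Return (covariate_specific_auc, pooled_auc) under the assumed model.

    Assumption (hypothetical, for illustration only):
    marker ~ N(b*Z + mu*D, 1), with P(Z = 1) = p in both cases and
    controls because controls are matched to cases on Z.
    """
    # Within either stratum of Z, case and control means differ by mu.
    adjusted = auc_binormal(mu)
    # Pooling ignores Z: a random case (covariate zc) is compared with a
    # random control (covariate zn), so the mean gap is mu + b*(zc - zn).
    weights = {0: 1.0 - p, 1: p}
    pooled = sum(
        weights[zc] * weights[zn] * auc_binormal(mu + b * (zc - zn))
        for zc in (0, 1)
        for zn in (0, 1)
    )
    return adjusted, pooled


adjusted, pooled = matched_aucs(mu=1.0, b=2.0, p=0.5)
print(f"covariate-specific AUC: {adjusted:.3f}")
print(f"pooled (unadjusted) AUC: {pooled:.3f}")
```

With these assumed values the pooled AUC comes out lower than the covariate-specific AUC, and the two coincide when b = 0, that is, when the covariate is unrelated to the marker. The size of the gap depends entirely on the assumed model, so this is only a toy illustration of why the analysis should account for the matching covariates.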
This publication has 29 references indexed in Scilit:
- The optimal ratio of cases to controls for estimating the classification accuracy of a biomarker. Biostatistics, 2005
- The association of body mass index and prostate‐specific antigen in a population‐based study. Cancer, 2005
- Quantification of Free Circulating DNA As a Diagnostic Marker in Lung Cancer. Journal of Clinical Oncology, 2003
- Adjusting receiver operating characteristic curves and related indices for covariates. Journal of the Royal Statistical Society: Series D (The Statistician), 2003
- Discriminant analysis through a semiparametric model. Biometrika, 2003
- Absent nasal bone in the prenatal detection of fetuses with trisomy 21 in a high-risk population. Published by Wolters Kluwer Health, 2003
- Semiparametric Receiver Operating Characteristic Analysis to Evaluate Biomarkers for Disease. Journal of the American Statistical Association, 2002
- Semiparametric Estimation of Regression Quantiles with Application to Standardizing Weight for Height and Age in US Children. Journal of the Royal Statistical Society Series C: Applied Statistics, 1999
- The Robustness of the "Binormal" Assumptions Used in Fitting ROC Curves. Medical Decision Making, 1988
- Indices of discrimination or diagnostic accuracy: Their ROCs and implied models. Psychological Bulletin, 1986