Conditional variable importance for random forests
Open Access
- 11 July 2008
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 9 (1), 307
- https://doi.org/10.1186/1471-2105-9-307
Abstract
Random forests are becoming increasingly popular in many scientific fields because they can cope with "small n large p" problems, complex interactions and even highly correlated predictor variables. Their variable importance measures have recently been suggested as screening tools for, e.g., gene expression studies. However, these variable importance measures show a bias towards correlated predictor variables.
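As a rough illustration of the bias described in the abstract, the sketch below fits a random forest to simulated data and computes ordinary (marginal) permutation importance on held-out observations. It is not the authors' implementation: it uses scikit-learn's `RandomForestRegressor` and `permutation_importance`, and the simulated data, the variable names (`x1`, `x2`, `x3`) and all parameter values are assumptions made for this example only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 1000

# Hypothetical simulated data (an assumption for illustration, not from the paper):
# x1 drives the response, x2 is strongly correlated with x1 but has no effect
# of its own, and x3 is independent noise.
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
y = x1 + 0.5 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Simple hold-out split so that importance is measured on unseen data.
X_train, X_test = X[:700], X[700:]
y_train, y_test = y[:700], y[700:]

# max_features=1 forces each split to consider a single randomly chosen
# predictor, so x2 is regularly used as a stand-in for x1.
forest = RandomForestRegressor(n_estimators=200, max_features=1, random_state=0)
forest.fit(X_train, y_train)

# Marginal permutation importance: mean drop in R^2 when one column is shuffled.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=20, random_state=0
)
importances = result.importances_mean

for name, value in zip(["x1", "x2", "x3"], importances):
    print(f"{name}: {value:.3f}")
```

In a setup like this, `x2` can be expected to receive a clearly higher score than `x3`, although neither appears in the equation that generates `y`. Only `x2` is correlated with the influential predictor, which is the pattern the abstract refers to.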