A Top-Performing Algorithm for the DREAM3 Gene Expression Prediction Challenge
Open Access
- 4 February 2010
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 5 (2) , e8944
- https://doi.org/10.1371/journal.pone.0008944
Abstract
A wealth of computational methods has been developed to address problems in systems biology, such as modeling gene expression. However, objectively evaluating and comparing such methods is notoriously difficult. The DREAM (Dialogue on Reverse Engineering Assessments and Methods) project is a community-wide effort to assess the relative strengths and weaknesses of different computational methods for a set of core problems in systems biology. This article presents a top-performing algorithm for one of the challenge problems in the third annual DREAM (DREAM3), namely the gene expression prediction challenge. In this challenge, participants are asked to predict the expression levels of a small set of genes in a yeast deletion strain, given the expression levels of all other genes in the same strain and complete gene expression data for several other yeast strains. I propose a simple k-nearest-neighbor (KNN) method to solve this problem. Despite its simplicity, this method works well for this challenge, sharing the “top performer” honor with a much more sophisticated method. I also describe several alternative, simple strategies, including a modified KNN algorithm that further improves the performance of the standard KNN method. The success of these methods suggests that complex methods attempting to integrate multiple data sets do not necessarily lead to better performance than simple yet robust methods. Furthermore, none of these top-performing methods, including the one by a different team, is based on gene regulatory networks, which suggests that accurately modeling gene expression using gene regulatory networks is unfortunately still a difficult task.
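The abstract does not give the details of the KNN procedure, so the following is only a minimal sketch of how a k-nearest-neighbor prediction for this kind of task might look, not the author's implementation. It assumes that gene similarity is measured by Euclidean distance between expression profiles across the other (fully measured) strains, and that the prediction is the unweighted mean of the neighbors' measured values in the target strain. The function name `knn_predict`, the toy gene names, and all numbers are hypothetical.

```python
import math


def knn_predict(reference, known_target, query_profile, k=3):
    """Predict one gene's expression in the target strain (illustrative only).

    reference     -- dict: gene name -> expression values across the fully
                     measured strains (hypothetical layout)
    known_target  -- dict: gene name -> measured expression in the target strain
    query_profile -- expression of the gene to predict across the same
                     fully measured strains, in the same order
    k             -- number of neighbors to average
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Distance from the query gene to every gene measured in the target strain.
    # Euclidean distance is an assumption; other similarity measures could be used.
    distances = []
    for gene, _value in known_target.items():
        profile = reference[gene]
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(profile, query_profile)))
        distances.append((d, gene))

    # Keep the k closest genes and average their values in the target strain.
    distances.sort()
    nearest = distances[:k]
    return sum(known_target[gene] for _, gene in nearest) / len(nearest)


# Toy data: three genes measured in three reference strains and in the target strain.
reference = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.1, 2.1, 2.9],
    "g3": [5.0, 5.0, 5.0],
}
known_target = {"g1": 2.0, "g2": 2.2, "g3": 6.0}

# Profile of the gene whose target-strain expression is unknown.
query_profile = [1.0, 2.0, 3.1]

prediction = knn_predict(reference, known_target, query_profile, k=2)
print(prediction)  # approximately 2.1: the mean of g1 and g2, the two closest genes
```

In this toy case the query gene's profile is close to `g1` and `g2` and far from `g3`, so with `k=2` the prediction is the average of their target-strain values. The choice of `k` and of the distance measure are the main tunable assumptions in such a sketch.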