Finding function: evaluation methods for functional genomic data
Open Access
- 25 July 2006
- journal article
- Published by Springer Nature in BMC Genomics
- Vol. 7 (1) , 187
- https://doi.org/10.1186/1471-2164-7-187
Abstract
Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications. We reveal issues with current evaluation methods here and suggest new approaches to evaluation that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate its use as an effective means of evaluation of genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website. Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.