Proteome Analyst: custom predictions with explanations in a web-based tool for high-throughput proteome annotations
Open Access
- 1 July 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (Web Server), W365-W371
- https://doi.org/10.1093/nar/gkh485
Abstract
Proteome Analyst (PA) (http://www.cs.ualberta.ca/~bioinfo/PA/) is a publicly available, high-throughput, web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. In addition, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e. a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for potassium-ion channel proteins and other general function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naïve Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user's confidence in, and understanding of, PA.
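To illustrate why a Naïve Bayes classifier lends itself to this kind of explanation, the following is a minimal sketch, not PA's implementation. It assumes each protein is represented by a bag of annotation tokens; the token names, labels, and the `train`, `score`, `predict`, and `explain` helpers are all hypothetical. Because a Naïve Bayes score is a sum of per-feature log-probabilities, the difference between two candidate labels splits into one term per token, which is the quantity an explanation view could display.

```python
import math
from collections import Counter, defaultdict


def train(examples, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing.

    examples: list of (tokens, label) pairs, i.e. labeled training data.
    Returns (log_prior, log_like) dictionaries keyed by label.
    """
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)

    total = sum(label_counts.values())
    log_prior = {c: math.log(n / total) for c, n in label_counts.items()}
    log_like = {}
    for c in label_counts:
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_like


def score(model, tokens):
    """Log-score of every label; tokens unseen in training are ignored."""
    log_prior, log_like = model
    return {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in log_prior
    }


def predict(model, tokens):
    """Return the label with the highest log-score."""
    scores = score(model, tokens)
    return max(scores, key=scores.get)


def explain(model, tokens, label_a, label_b):
    """Per-token log-odds in favour of label_a over label_b.

    Positive values support label_a; negative values support label_b.
    """
    _, log_like = model
    return {
        t: log_like[label_a][t] - log_like[label_b][t]
        for t in set(tokens)
        if t in log_like[label_a]
    }


# Hypothetical toy training set: tokens and labels are invented for illustration.
TRAINING = [
    (["signal_peptide", "secreted"], "extracellular"),
    (["secreted", "glycoprotein"], "extracellular"),
    (["dna_binding", "transcription"], "nucleus"),
    (["transcription", "zinc_finger"], "nucleus"),
]

model = train(TRAINING)
query = ["transcription", "glycoprotein"]
prediction = predict(model, query)
contributions = explain(model, query, "nucleus", "extracellular")
```

In this toy example the `transcription` token contributes positive evidence for `nucleus` and the `glycoprotein` token contributes negative evidence, so a user can see which features tipped the decision rather than only the final label.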