Automated Gene Ontology annotation for anonymous sequence data
Open Access
- 1 July 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (13), 3712-3715
- https://doi.org/10.1093/nar/gkg582
Abstract
Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation with GO terms is performed in most current genome projects; besides being general, it is well suited to computer-based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species-independent GO structure and vocabulary together with a series of protein databases collected from various sites to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.
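The general idea of annotating by sequence similarity can be illustrated with a minimal sketch. It is not GOblet's implementation: it assumes that similarity hits are already available as (query, subject, E-value) tuples, that a hypothetical mapping `protein_to_go` links reference proteins to GO term IDs, and that the user-selectable sensitivity is modelled as a single E-value cutoff (`max_evalue`). The function name `transfer_go_terms` and the sequence identifiers are likewise invented for illustration.

```python
from collections import defaultdict


def transfer_go_terms(hits, protein_to_go, max_evalue=1e-10):
    """Assign GO terms to query sequences from their similarity hits.

    hits          -- iterable of (query_id, subject_id, evalue) tuples
    protein_to_go -- dict: reference protein ID -> set of GO term IDs
                     (hypothetical mapping, assumed for this sketch)
    max_evalue    -- hits with a larger E-value are ignored

    Returns a dict: query_id -> {GO term ID: best (lowest) E-value}.
    """
    annotations = defaultdict(dict)
    for query_id, subject_id, evalue in hits:
        if evalue > max_evalue:
            continue  # hit is too weak for the chosen sensitivity
        for term in protein_to_go.get(subject_id, ()):
            best = annotations[query_id].get(term)
            # Keep the strongest evidence seen so far for each term.
            if best is None or evalue < best:
                annotations[query_id][term] = evalue
    return dict(annotations)


# Toy data: identifiers are made up; the GO IDs are real terms
# (protein kinase activity, ATP binding, DNA binding).
example_hits = [
    ("est1", "refA", 1e-30),
    ("est1", "refB", 1e-5),
    ("est2", "refB", 1e-20),
]
example_go = {
    "refA": {"GO:0004672", "GO:0005524"},
    "refB": {"GO:0003677"},
}

strict = transfer_go_terms(example_hits, example_go, max_evalue=1e-10)
relaxed = transfer_go_terms(example_hits, example_go, max_evalue=1e-3)
```

With the strict cutoff, `est1` receives only the two terms from `refA`; relaxing the cutoff additionally transfers the term from the weaker `refB` hit, which shows how the sensitivity setting trades coverage against reliability.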