EVEREST: a collection of evolutionary conserved protein domains

Open Access

11 November 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 35 (Database) , D241-D246
https://doi.org/10.1093/nar/gkl850

Abstract

Protein domains are subunits of proteins that recur throughout the protein world. There are many definitions attempting to capture the essence of a protein domain, and several systems that identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatics, 7, 277, is one such system that performs the task automatically, using protein sequence alone. Herein we describe EVEREST release 2.0, consisting of 20 029 families, each defined by one or more HMMs. The current EVEREST database was constructed by scanning UniProt 8.1 and all PDB sequences (over 3 000 000 sequences in total) with each of the EVEREST families. EVEREST annotates 64% of all sequences and covers 59% of all residues. EVEREST is available at Author Webpage. The website provides annotations given by SCOP, CATH, Pfam A and EVEREST. It allows browsing through the families of each of those sources and graphically visualizing the domain organization of the proteins in each family. The website also provides access to analyses of relationships between domain families, within and across domain definition systems. Users can upload sequences for analysis by the set of EVEREST families. Finally, an advanced search form allows querying for families that match criteria regarding novelty, phylogenetic composition and more.
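The two coverage figures quoted above measure different things: the fraction of sequences that carry at least one domain annotation, and the fraction of residues that fall inside any annotated domain. The following is a minimal sketch of how such statistics could be computed, not the EVEREST pipeline itself. The function name `coverage`, the input layout (sequence lengths plus 1-based inclusive domain intervals) and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive (assumed convention)


def coverage(lengths: Dict[str, int],
             hits: Dict[str, List[Interval]]) -> Tuple[float, float]:
    """Return (fraction of sequences annotated, fraction of residues covered).

    `lengths` maps a sequence id to its length; `hits` maps a sequence id
    to its domain intervals. Both structures are hypothetical.
    """
    if not lengths:
        raise ValueError("no sequences given")
    total_residues = sum(lengths.values())
    annotated_sequences = 0
    covered_residues = 0
    for seq_id, length in lengths.items():
        positions = set()
        for start, end in hits.get(seq_id, []):
            # Clamp each interval to the sequence, then collect positions so
            # that overlapping domains are not counted twice.
            positions.update(range(max(1, start), min(length, end) + 1))
        if positions:
            annotated_sequences += 1
        covered_residues += len(positions)
    return annotated_sequences / len(lengths), covered_residues / total_residues


# Toy data, invented for this example only.
lengths = {"A": 100, "B": 50, "C": 50}
hits = {"A": [(1, 40), (30, 60)], "B": [(11, 20)]}
seq_frac, res_frac = coverage(lengths, hits)
print(f"sequences annotated: {seq_frac:.0%}, residues covered: {res_frac:.0%}")
```

In this toy input two of three sequences have a hit, while only 70 of 200 residues lie inside a domain, which shows why the sequence-level and residue-level figures can differ.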
