SMART, a simple modular architecture research tool: Identification of signaling domains

Abstract
Accurate multiple alignments of 86 domains that occur in signaling proteins have been constructed and used to provide a Web-based tool (SMART: simple modular architecture research tool) that allows rapid identification and annotation of signaling domain sequences. Most signaling proteins are multidomain, and a considerable variety of domain combinations is known. Comparison with established databases showed that 25% of our domain set could not be deduced from SwissProt and 41% could not be annotated by Pfam. SMART is able to determine the modular architectures of single sequences or genomes; application to the entire yeast genome revealed that at least 6.7% of its genes contain one or more signaling domains, approximately 350 more than previously annotated. The process of constructing SMART predicted (i) novel domain homologues in unexpected locations, such as band 4.1-homologous domains in focal adhesion kinases; (ii) previously unknown domain families, including a citron-homology domain; (iii) putative functions of domain families after identification of additional family members, for example, a ubiquitin-binding role for ubiquitin-associated (UBA) domains; (iv) cellular roles for proteins, such as predicted DEATH domains in netrin receptors, further implicating these molecules in axonal guidance; (v) signaling domains in known disease genes, such as SPRY domains in both marenostrin/pyrin and Midline 1; (vi) domains in unexpected phylogenetic contexts, such as diacylglycerol kinase homologues in yeast and bacteria; and (vii) likely protein misclassifications, exemplified by a predicted pleckstrin homology domain in a Candida albicans protein previously described as an integrin.