Structure and distribution of modules in extracellular proteins

Abstract
It has become standard practice to compare new amino acid and nucleotide sequences with existing ones in the rapidly growing sequence databases. This has led to the recurring identification of certain sequence patterns, usually spanning fewer than 300 amino acids. Many of these identifiable sequence regions have been shown to fold into a ‘domain’ structure; they are often called protein ‘modules’ (see definitions below). Proteins that contain such modules are widely distributed in biology, but the modules are particularly common in extracellular proteins.