Utilizing logical relationships in genomic data to decipher cellular processes

Abstract
The wealth of available genomic data has spawned a corresponding interest in computational methods that can impart biological meaning and context to these experiments. Traditional computational methods have drawn relationships between pairs of proteins or genes based on notions of equality or similarity between their patterns of occurrence or behavior. For example, two genes displaying similar variation in expression, over a number of experiments, may be predicted to be functionally related. We have introduced a natural extension of these approaches, instead identifying logical relationships involving triplets of proteins. Triplets provide for various discrete kinds of logic relationships, leading to detailed inferences about biological associations. For instance, a protein C might be encoded within an organism if, and only if, two other proteins A and B are also both encoded within the organism, thus suggesting that gene C is functionally related to genes A and B. The method has been applied fruitfully to both phylogenetic and microarray expression data, and has been used to associate logical combinations of protein activity with disease state phenotypes, revealing previously unknown ternary relationships among proteins, and illustrating the inherent complexities that arise in biological data.