Empirically controlled mapping of the Caenorhabditis elegans protein-protein interactome network

Abstract
High-throughput yeast two-hybrid screening is used to generate the largest C. elegans interactome resource available thus far. Using the empirical quality control framework presented in Venkatesan et al. (also online), the data set is evaluated for quality and used to estimate the total size of the worm interactome.

To provide accurate biological hypotheses and elucidate global properties of cellular networks, systematic identification of protein-protein interactions must meet high quality standards. We present an expanded C. elegans protein-protein interaction network, or 'interactome' map, derived from testing a matrix of ∼10,000 × ∼10,000 proteins using a highly specific, high-throughput yeast two-hybrid system. Through a new empirical quality control framework, we show that the resulting data set (Worm Interactome 2007, or WI-2007) is similar in quality to low-throughput data curated from the literature. We filtered previous interaction data sets and integrated them with WI-2007 to generate a high-confidence consolidated map (Worm Interactome version 8, or WI8). This work allowed us to estimate the size of the worm interactome at ∼116,000 interactions. Comparison with other types of functional genomic data shows that distinct experimental approaches are complementary, each predicting different functional relationships between genes or proteins.