Predicting protein function from sequence and structure

Abstract
'Inheritance through homology' is the most common and generally the most accessible approach to function prediction, but orthology should be established where possible to improve confidence in predictions. The body of functional annotations of proteins is becoming increasingly computer-readable and is being organized in ways that can enhance the scope of in silico prediction methods. Significant advances in complete genome sequencing have resulted in a new generation of methods that exploit sequence analysis at the genome level. Curated protein family resources can often guide the assignment of protein functions and the detection of motifs or sequence patterns. New approaches are being developed to identify functional residues in proteins; these residues can then be used to divide larger protein families into more specific functional subfamilies. There have been exciting new developments in databases of experimentally determined protein–protein interactions, as well as in genomic inference methods for predicting these interactions. Non-homology-based function prediction methods, which exploit the properties of sequences rather than their evolutionary history, are also becoming more successful. Recent Structural Genomics Initiatives (SGIs) are attempting to target functionally diverse relatives within protein families. Function prediction from structure can be achieved by global comparison of protein structures to detect homology or through the use of structural templates derived from the active sites of enzymes. It is also possible to explore the protein surface for sequence-conserved patches, clefts and electrostatic potentials. In general, it is best to apply several methods and compare their results when predicting the function of a novel protein. Meta-servers simplify this by providing easy access to a range of the best-performing methods.
Future developments will see more efficient integration of prediction methods with experimental data, for example from microarrays, yeast two-hybrid screens and tandem affinity purification. Better understanding of the diversification of function in protein families will permit more sophisticated means of predicting function and functional networks.