Automated generation and refinement of protein signatures: case study with G-protein coupled receptors

Abstract
Motivation: Previous work established that sparse signatures (essentially sequence-length motifs) can be derived by examining points of contact between residues in proteins of known three-dimensional (3D) structure. Many interesting protein families, however, have very little tertiary structural information. Methods for deriving signatures using only primary and secondary structural information were therefore developed.

Results: Two methods for deriving protein signatures from protein sequence information and predicted secondary structures are described. One method is based on a scoring approach, the other on a genetic algorithm (GA). The effectiveness of the method was tested on the superfamily of G-protein coupled receptors (GPCRs) and compared with the established hidden Markov model (HMM) method. The signature method is shown to perform well, detecting 68% of superfamily members before the first false positive sequence and detecting several distant relationships. The GA population was used to provide information on alignment regions of particular importance for selection of key residues.

Contact: howard@bmb.leeds.ac.uk

Supplementary information: Software developed for this project and further data are available online (http://bbsrc-bioinf.leeds.ac.uk/BIOINF/jhp/).
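The headline figure in the Results (superfamily members detected before the first false positive) can be illustrated with a minimal sketch. This is not the authors' software: the function name, the toy ranking, and the choice of denominator (all known superfamily members in the searched database) are assumptions made for illustration only.

```python
from typing import Sequence


def coverage_before_first_false_positive(
    ranked_is_member: Sequence[bool], total_members: int
) -> float:
    """Fraction of family members ranked above the first non-member hit.

    ranked_is_member: search hits in best-first order; True marks a known
        superfamily member, False marks a false positive (hypothetical input).
    total_members: number of known members in the searched database
        (assumed denominator; the abstract does not spell it out).
    """
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    true_hits = 0
    for is_member in ranked_is_member:
        if not is_member:
            break  # stop at the first false positive
        true_hits += 1
    return true_hits / total_members


# Toy ranking: three members, then a false positive, then one more member.
ranked = [True, True, True, False, True]
coverage = coverage_before_first_false_positive(ranked, total_members=5)
print(f"{coverage:.0%}")  # prints "60%"
```

Under this definition, members ranked below the first false positive do not count, which makes the measure stricter than overall recall.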
