A novel method for GPCR recognition and family classification from sequence alone using signatures derived from profile hidden Markov models
- 1 October 2003
- journal article
- research article
- Published by Taylor & Francis in SAR and QSAR in Environmental Research
- Vol. 14 (5-6) , 413-420
- https://doi.org/10.1080/10629360310001623999
Abstract
G-protein coupled receptors (GPCRs) constitute a broad class of cell-surface receptors, comprising several functionally distinct families, that play a key role in cellular signalling and the regulation of basic physiological processes. GPCRs are the focus of a significant amount of current pharmaceutical research: they interact with more than 50% of prescription drugs and remain the best potential targets for drug design. Given the wealth of data produced by genome sequencing projects, computational tools for the automated characterization of novel GPCRs are imperative. Typical computational strategies for identifying and classifying GPCRs involve sequence similarity searches (e.g. BLAST) coupled with pattern database analysis (e.g. PROSITE, BLOCKS). The diagnostic method presented here is based on a probabilistic approach that exploits highly discriminative profile hidden Markov models, excised from low-entropy regions of multiple sequence alignments, to derive potent family signatures. For a given query, a P-value is obtained by combining individual hits derived from the same family. A best-guess family membership is thus assigned, allowing classification of GPCRs at the family level using primary structure information alone. A web-based version of the application is freely available at http://bioinformatics.biol.uoa.gr/PRED-GPCR.
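As a rough illustration of what "low-entropy regions of multiple sequence alignments" can mean in practice, the sketch below computes the per-column Shannon entropy of a toy alignment and reports runs of well-conserved columns. It is not the authors' implementation: the function names, the entropy threshold, the minimum run length, and the toy sequences are all assumptions made for this example, and the abstract does not specify how regions were actually selected.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c not in "-."]
    if not residues:
        return float("inf")  # all-gap column: treat as uninformative
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_entropy_regions(alignment, max_entropy=1.0, min_length=3):
    """Return half-open (start, end) column ranges whose entropy stays <= max_entropy.

    max_entropy and min_length are illustrative defaults, not values from the paper.
    """
    entropies = [column_entropy(col) for col in zip(*alignment)]
    regions, start = [], None
    # A trailing sentinel closes a run that reaches the last column.
    for i, h in enumerate(entropies + [float("inf")]):
        if h <= max_entropy:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


# Hypothetical toy alignment (invented sequences) with one fully conserved block.
toy_alignment = [
    "MKDRYLAVTA",
    "MQDRYVAISG",
    "LSDRYWAC-P",
    "AEDRYFAKNH",
]

print(low_entropy_regions(toy_alignment))
```

In a pipeline of the kind the abstract describes, each reported column range would be excised from the alignment and used to train a separate profile hidden Markov model. That step, and the combination of per-signature hits into a single P-value, are outside the scope of this sketch.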