PROCAIN: protein profile comparison with assisting information
Open Access
- 7 April 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 37 (11), 3522-3530
- https://doi.org/10.1093/nar/gkp212
Abstract
Detection of remote sequence homology is essential for the accurate inference of protein structure, function and evolution. The most sensitive detection methods involve the comparison of evolutionary patterns reflected in multiple sequence alignments (MSAs) of protein families. We present PROCAIN, a new method for MSA comparison based on the combination of ‘vertical’ MSA context (substitution constraints at individual sequence positions) and ‘horizontal’ context (patterns of residue content at multiple positions). Based on a simple and tractable profile methodology and primitive measures for the similarity of horizontal MSA patterns, the method achieves a quality of homology detection comparable to that of a more complex, advanced method employing hidden Markov models (HMMs) and secondary structure (SS) prediction. Adding SS information further improves PROCAIN performance beyond the capabilities of current state-of-the-art tools. The potential value of the method for structure/function predictions is illustrated by the detection of subtle homology between evolutionarily distant yet structurally similar protein domains. PROCAIN, relevant databases and tools can be downloaded from http://prodata.swmed.edu/procain/download. The web server can be accessed at http://prodata.swmed.edu/procain/procain.php.
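The distinction between 'vertical' and 'horizontal' context can be made concrete with a toy example. The sketch below is not PROCAIN's scoring scheme, which the abstract does not specify. It assumes a dot product of residue frequencies as a stand-in for the vertical term and a windowed comparison of column conservation as a stand-in for the horizontal term. All function names, the pseudocount, the window size and the weight are hypothetical choices made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(column, pseudocount=0.1):
    """Residue frequencies for one MSA column; gaps and unknowns are ignored."""
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def msa_profiles(msa):
    """One frequency profile per column of an MSA given as equal-length strings."""
    return [column_profile(col) for col in zip(*msa)]


def vertical_score(p, q):
    """Illustrative 'vertical' term: dot product of two columns' frequencies."""
    return sum(p[a] * q[a] for a in AMINO_ACIDS)


def conservation(p):
    """Information content of a column in bits: log2(20) minus Shannon entropy."""
    return math.log2(len(AMINO_ACIDS)) + sum(f * math.log2(f) for f in p.values())


def horizontal_score(cons_a, cons_b, i, j, half_window=2):
    """Illustrative 'horizontal' term: negative mean absolute difference of
    conservation values in windows centred on positions i and j."""
    diffs = [
        abs(cons_a[i + k] - cons_b[j + k])
        for k in range(-half_window, half_window + 1)
        if 0 <= i + k < len(cons_a) and 0 <= j + k < len(cons_b)
    ]
    return -sum(diffs) / len(diffs)


def combined_score(profiles_a, profiles_b, i, j, weight=0.5):
    """Weighted sum of the two terms; the weight is an arbitrary assumption
    and the two terms are not on a calibrated common scale."""
    cons_a = [conservation(p) for p in profiles_a]
    cons_b = [conservation(p) for p in profiles_b]
    v = vertical_score(profiles_a[i], profiles_b[j])
    h = horizontal_score(cons_a, cons_b, i, j)
    return weight * v + (1.0 - weight) * h
```

In a full profile-profile method, a position-pair score of this kind would typically feed a dynamic programming alignment of the two profiles. That step, along with any statistical significance estimate, is omitted here.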