Prediction of integral membrane protein type by collocated hydrophobic amino acid pairs
- 19 June 2008
- journal article
- research article
- Published by Wiley in Journal of Computational Chemistry
- Vol. 30 (1) , 163-172
- https://doi.org/10.1002/jcc.21053
Abstract
A computational model, IMP‐TYPE, is proposed for the classification of five types of integral membrane proteins from protein sequence. The model aims not only to provide accurate predictions but, most importantly, to incorporate interesting and transparent biological patterns. Compared with the best‐performing existing models, IMP‐TYPE reduces their error rates by 19% and 34% in two out‐of‐sample tests performed on benchmark datasets. Our empirical evaluations also show that the proposed method provides even larger improvements, i.e., 29% and 45% error rate reductions, when predictions are made for sequences that share low (40%) identity with sequences from the training dataset. We also show that IMP‐TYPE can be used in a standalone mode, i.e., it reproduces a significant majority of the correct predictions provided by other leading methods, while also correctly predicting sequences that the other methods misclassify. Our method computes predictions using a Support Vector Machine classifier that takes a feature‐based encoding of the sequence as its input. The input feature set includes hydrophobic AA pairs, which were selected by a consensus of three feature selection algorithms. The hydrophobic residues that make up the AA pairs used by our method have been shown, in a few recent studies of integral membrane proteins, to be associated with the formation of transmembrane helices. Our study also indicates that Met and Phe display a certain degree of hydrophobicity, which may be more crucial than their polarity or aromaticity when they occur in transmembrane segments. This conclusion is supported by a recent study on the potential of mean force for membrane protein folding and by a study of scales for the membrane propensity of amino acids. © 2008 Wiley Periodicals, Inc. J Comput Chem, 2009
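The feature encoding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how collocated pairs are counted, so the sketch assumes a k-spaced pair scheme (pairs of residues separated by 0 to `max_gap` positions). The residue set `HYDROPHOBIC`, the function name `k_spaced_pair_features`, and the `max_gap` default are all hypothetical choices made for illustration.

```python
from itertools import product

# Assumption: an illustrative hydrophobic alphabet. The residue set actually
# selected in the paper is not given in the abstract.
HYDROPHOBIC = "AFILMVW"


def k_spaced_pair_features(sequence, max_gap=2, alphabet=HYDROPHOBIC):
    """Return normalized frequencies of residue pairs from `alphabet`
    whose two members are separated by 0..max_gap intervening positions."""
    seq = sequence.upper()
    features = {}
    for gap in range(max_gap + 1):
        # One counter per ordered pair, so every sequence yields the same keys.
        counts = {a + b: 0 for a, b in product(alphabet, repeat=2)}
        # Number of positions where a pair with this gap fits in the sequence.
        n_windows = len(seq) - gap - 1
        for i in range(max(n_windows, 0)):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
        for pair, count in counts.items():
            key = f"{pair}_gap{gap}"
            features[key] = count / n_windows if n_windows > 0 else 0.0
    return features


if __name__ == "__main__":
    feats = k_spaced_pair_features("LLAVL")
    # Show only the pairs that actually occur in this toy sequence.
    print({k: round(v, 3) for k, v in feats.items() if v > 0})
```

A fixed-length vector built from these values (in a fixed key order) could then be passed to any Support Vector Machine implementation. The consensus feature selection step described in the abstract is not shown here.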