Transmembrane helix predictions revisited

1 December 2002

journal article
Published by Wiley in Protein Science

Vol. 11 (12) , 2774-2791
https://doi.org/10.1110/ps.0214502

Abstract

Methods that predict membrane helices have become increasingly useful in the context of analyzing entire proteomes, as well as in everyday sequence analysis. Here, we analyzed 27 advanced and simple methods in detail. To resolve contradictions in previous works and to reevaluate transmembrane helix prediction algorithms, we introduced an analysis that distinguished between performance on redundancy-reduced high- and low-resolution data sets, established thresholds for significant differences in performance, and implemented both per-segment and per-residue analysis of membrane helix predictions. Although some of the advanced methods performed better than others, we showed in a thorough bootstrapping experiment based on various measures of accuracy that no method performed consistently best. In contrast, most simple hydrophobicity scale-based methods were significantly less accurate than any advanced method as they overpredicted membrane helices and confused membrane helices with hydrophobic regions outside of membranes. The advanced methods, by comparison, usually distinguished correctly between membrane-helical and other proteins. Nonetheless, few methods reliably distinguished between signal peptides and membrane helices. We could not verify a significant difference in performance between eukaryotic and prokaryotic proteins. Surprisingly, we found that proteins with more than five helices were predicted at a significantly lower accuracy than proteins with five or fewer. The important implication is that structurally unsolved multispanning membrane proteins, which are often important drug targets, will remain problematic for transmembrane helix prediction algorithms. Overall, by establishing a standardized methodology for transmembrane helix prediction evaluation, we have resolved differences among previous works and presented novel trends that may impact the analysis of entire proteomes.
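As a rough illustration of the per-residue analysis and bootstrapping mentioned in the abstract, the Python sketch below scores two-state predictions and estimates the standard error of the mean score by resampling proteins with replacement. It is a minimal sketch under stated assumptions, not the authors' protocol: the function names, the "H"/"-" label encoding, the number of resamples, and the toy strings are all hypothetical and chosen only for illustration.

```python
import random


def per_residue_accuracy(observed, predicted):
    """Fraction of residues whose two-state label matches.

    Assumed encoding (hypothetical): 'H' = membrane helix, '-' = anything else.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    if not observed:
        raise ValueError("empty sequence")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def bootstrap_mean(scores, n_resamples=1000, seed=0):
    """Return (mean, standard error of the mean) for per-protein scores.

    The standard error is estimated as the standard deviation of the means
    of n_resamples bootstrap samples drawn with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        sample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    grand = sum(means) / n_resamples
    variance = sum((m - grand) ** 2 for m in means) / (n_resamples - 1)
    return sum(scores) / n, variance ** 0.5


# Toy data, invented for illustration only (not taken from the paper).
observed = ["--HHHH--", "HHHH----", "--HH--HH"]
predicted = ["--HHH---", "HHHH----", "-HHH--H-"]
scores = [per_residue_accuracy(o, p) for o, p in zip(observed, predicted)]
mean_score, std_error = bootstrap_mean(scores)
```

Under this kind of scheme, two methods would be treated as different only if their mean scores differ by more than the bootstrap error estimates allow; the exact thresholds the authors used are not given in the abstract.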
