Alignments grow, secondary structure prediction improves
- 6 December 2001
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 46 (2) , 197-205
- https://doi.org/10.1002/prot.10029
Abstract
Using information from sequence alignments significantly improves protein secondary structure prediction. Typically, more divergent profiles yield better predictions. Recently, various groups have shown that accuracy can be improved significantly by using PSI‐BLAST profiles to develop new prediction methods. Here, we focused on the influences of various alignment strategies on two 8‐year‐old PHD methods. The following results stood out. (i) PHD using pairwise alignments predicts about 72% of all residues correctly in one of the three states: helix, strand, and other. Using larger databases and PSI‐BLAST raised accuracy to 75%. (ii) More than 60% of the improvement originated from the growth of current sequence databases; about 20% resulted from detailed changes in the alignment procedure (substitution matrix, thresholds, and gap penalties). Another 20% of the improvement resulted from carefully using iterated PSI‐BLAST searches. (iii) It is of interest that we failed to improve prediction accuracy further when attempting to refine the alignment by dynamic programming (MaxHom and ClustalW). (iv) Improvement through family growth appears to saturate at some point. However, most families have not reached this saturation. Hence, we anticipate that prediction accuracy will continue to rise with database growth. Proteins 2002;46:197–205.
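The accuracy figures in the abstract are per-residue three-state scores: the percentage of residues predicted in the correct state (helix, strand, or other). The sketch below shows how such a score could be computed and how the reported gain from 72% to 75% splits into percentage points. It is only an illustration, not the PHD evaluation code: the function name `q3`, the one-letter state codes (H, E, L), and the toy strings are assumptions, and the shares are the abstract's rounded figures, with 60% standing in for "more than 60%".

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical toy example (not data from the paper).
# Assumed state codes: H = helix, E = strand, L = other.
observed = "LLHHHHHHLLEEEELL"
predicted = "LLHHHHHLLLEEELLL"
toy_accuracy = q3(predicted, observed)  # 14 of 16 residues agree

# Split the reported gain into percentage points per source.
# Shares are approximate, taken from the abstract's rounded values.
baseline, improved = 72.0, 75.0
gain = improved - baseline
shares = {
    "database growth": 0.60,      # abstract: "more than 60%"
    "alignment procedure": 0.20,  # abstract: "about 20%"
    "iterated PSI-BLAST": 0.20,   # abstract: "another 20%"
}
points = {source: round(share * gain, 2) for source, share in shares.items()}

if __name__ == "__main__":
    print(f"toy three-state accuracy: {toy_accuracy:.1f}%")
    for source, value in points.items():
        print(f"{source}: about {value} percentage points")
```

Read this way, database growth alone accounts for roughly 1.8 of the 3 percentage points gained, which is consistent with the abstract's expectation that accuracy will keep rising as sequence databases grow.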