A structure and evolution-guided Monte Carlo sequence selection strategy for multiple alignment-based analysis of proteins
Open Access
- 22 November 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (2), 149-156
- https://doi.org/10.1093/bioinformatics/bti791
Abstract
Motivation: Various multiple sequence alignment-based methods have been proposed to detect functional surfaces in proteins, such as active sites or protein interfaces. The effect that the choice of sequences has on the conclusions of such analyses has seldom been discussed. In particular, no method has been discussed in terms of its ability to optimize the sequence selection for the reliable detection of functional surfaces.

Results: Here we propose, for the case of proteins with known structure, a heuristic Metropolis Monte Carlo strategy to select sequences from a large set of homologues, in order to improve detection of functional surfaces. The quantity guiding the optimization is the clustering of residues which are under increased evolutionary pressure, according to the sample of sequences under consideration. We show that we can either improve the overlap of our prediction with known functional surfaces in comparison with selection by sequence similarity criteria, or match the quality of prediction obtained through more elaborate non-structure-based methods of sequence selection. For the purpose of demonstration we use a set of 50 homodimerizing enzymes which were co-crystallized with their substrates and cofactors.

Contact: imihalek@bcm.tmc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
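
The selection strategy summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names `clustering_score` and `metropolis_select`, the single-sequence toggle move, the fixed temperature, and in particular the scoring function, which uses column identity as a crude stand-in for evolutionary pressure and counts structural contacts among the top-ranked columns as a stand-in for clustering. The abstract does not specify any of these details.

```python
import math
import random
from collections import Counter


def clustering_score(selected, alignment, contacts, top_k):
    """Toy objective (assumption): number of contacting pairs among the
    top_k most conserved alignment columns, given the selected sequences."""
    rows = [alignment[i] for i in sorted(selected)]
    n_cols = len(rows[0])
    conservation = []
    for col in range(n_cols):
        counts = Counter(row[col] for row in rows)
        # Fraction of selected sequences sharing the most common residue.
        conservation.append(counts.most_common(1)[0][1] / len(rows))
    # Rank columns by conservation; ties are broken by column index.
    ranked = sorted(range(n_cols), key=lambda c: (-conservation[c], c))
    top = set(ranked[:top_k])
    return sum(1 for a, b in contacts if a in top and b in top)


def metropolis_select(n_items, score, n_steps=2000, temperature=0.1,
                      min_size=2, seed=0):
    """Metropolis Monte Carlo search over subsets of range(n_items).

    Each step toggles one item in or out of the current subset. Moves that
    do not lower the score are always accepted; moves that lower it are
    accepted with probability exp(delta / temperature).
    """
    rng = random.Random(seed)
    current = set(range(n_items))  # start from the full set of homologues
    current_score = score(current)
    best, best_score = set(current), current_score
    for _ in range(n_steps):
        proposal = set(current)
        proposal.symmetric_difference_update({rng.randrange(n_items)})
        if len(proposal) < min_size:
            continue  # reject subsets too small to estimate conservation
        new_score = score(proposal)
        delta = new_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, new_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return best, best_score
```

A hypothetical call would pass a closure over the alignment and a set of residue-pair contacts derived from the known structure, for example `metropolis_select(len(alignment), lambda s: clustering_score(s, alignment, contacts, top_k=3))`. Tracking the best subset seen, rather than returning the final state, is a design choice of this sketch: at nonzero temperature the chain may drift away from a good selection.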