Finding weak similarities between proteins by sequence profile comparison
Open Access
- 15 January 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (2) , 683-689
- https://doi.org/10.1093/nar/gkg154
Abstract
To improve the recognition of weak similarities between proteins, a method of aligning two sequence profiles is proposed. It is shown that exploring the sequence space in the vicinity of the sequence with unknown properties significantly improves the performance of sequence alignment methods. Consistent with previous observations, the recognition sensitivity and alignment accuracy obtained by a profile–profile alignment method can be as much as 30% higher than those obtained by the sequence–profile alignment method. It is demonstrated that the choice of score function and the diversity of the test profile are very important factors for achieving the maximum performance of the method, whereas the optimum range of these parameters depends on the level of similarity to be recognized.
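The abstract does not specify the score function or the alignment procedure, so the following is only a minimal sketch of what aligning two sequence profiles can look like, under stated assumptions. It treats each profile as a list of per-column residue frequency vectors, scores a pair of columns with a shifted dot product (one possible score function, chosen here for illustration and not taken from the paper), and runs a Smith-Waterman-style local alignment with linear gap penalties. The names `column_score` and `local_profile_alignment_score`, and the `gap` and `shift` values, are hypothetical.

```python
from typing import List, Sequence

# A profile is one frequency vector per alignment column (assumed representation).
Profile = List[Sequence[float]]


def column_score(p: Sequence[float], q: Sequence[float], shift: float = 0.0) -> float:
    """Dot product of two column frequency vectors, minus a constant shift.

    The dot product is an illustrative choice of score function; the shift
    makes dissimilar columns score negative so that local alignment is meaningful.
    """
    return sum(a * b for a, b in zip(p, q)) - shift


def local_profile_alignment_score(a: Profile, b: Profile,
                                  gap: float = 0.5, shift: float = 0.3) -> float:
    """Best local alignment score between two profiles (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0.0] * cols for _ in range(rows)]  # dynamic programming matrix
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            match = h[i - 1][j - 1] + column_score(a[i - 1], b[j - 1], shift)
            # Local alignment: a cell never drops below zero.
            h[i][j] = max(0.0, match, h[i - 1][j] - gap, h[i][j - 1] - gap)
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    # Toy two-letter alphabet; real profiles would have 20 entries per column.
    similar = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(local_profile_alignment_score(similar, similar))
```

In this sketch the score function is isolated in `column_score`, so alternative functions can be swapped in without touching the dynamic programming loop, which is one way to study the dependence on score function that the abstract describes.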