Using evolutionary information for the query and target improves fold recognition
- 22 October 2003
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 54 (2), 342-350
- https://doi.org/10.1002/prot.10565
Abstract
In this study, we show that it is possible to increase the performance over PSI‐BLAST by using evolutionary information for both query and target sequences. This information can be used in three different ways: by sequence linking, by profile–profile alignments, and by combining sequence–profile and profile–sequence searches. If only PSI‐BLAST is used, 16% of superfamily‐related protein domains can be detected at 90% specificity. If a sequence–profile and a profile–sequence search are combined, this increases to 20%; profile–profile searches detect 19%, whereas a linking procedure identifies 22% of these proteins. All three methods show equal performance, but the best combination of speed and accuracy seems to be obtained by the combined searches, because this method shows good performance even at high specificity and has the lowest computational cost. In addition, we show that the E‐values reported by all these methods, including PSI‐BLAST, underestimate the true rate of false positives. This behavior is seen even if a very strict E‐value cutoff and a limited number of iterations are used. However, the difference is more pronounced with a looser E‐value cutoff and more iterations. Proteins 2003;53:000–000.
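The abstract does not spell out how the two searches are combined or how detection at a given specificity is computed, so the following is only a minimal sketch under stated assumptions. It assumes that each search yields one E-value per query-target pair, that the combined score is the lower of the two E-values, and that specificity means TP / (TP + FP) over the ranked hit list. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (query domain, target domain)


def combine_min_evalue(a: Dict[Pair, float], b: Dict[Pair, float]) -> Dict[Pair, float]:
    """Merge two searches, keeping the lower E-value per pair (assumed rule)."""
    combined = dict(a)
    for pair, evalue in b.items():
        combined[pair] = min(evalue, combined.get(pair, float("inf")))
    return combined


def detected_at_specificity(
    evalues: Dict[Pair, float],
    related: Set[Pair],
    min_specificity: float = 0.9,
) -> float:
    """Fraction of related pairs found while TP / (TP + FP) stays >= min_specificity.

    Hits are ranked from most to least significant; the deepest cutoff that
    still satisfies the specificity requirement is used.
    """
    ranked = sorted(evalues.items(), key=lambda item: item[1])
    tp = fp = best_tp = 0
    for pair, _ in ranked:
        if pair in related:
            tp += 1
        else:
            fp += 1
        if tp / (tp + fp) >= min_specificity:
            best_tp = tp
    return best_tp / len(related)


# Toy data (hypothetical): four truly related pairs, one of them never reported.
related = {("q1", "t1"), ("q1", "t2"), ("q2", "t4"), ("q3", "t5")}
seq_profile = {("q1", "t1"): 1e-5, ("q1", "t2"): 0.5, ("q2", "t3"): 2.0}
profile_seq = {("q1", "t2"): 1e-3, ("q2", "t4"): 0.01, ("q2", "t3"): 1.0}

combined = combine_min_evalue(seq_profile, profile_seq)
single_rate = detected_at_specificity(seq_profile, related)
combined_rate = detected_at_specificity(combined, related)
```

In this toy example the combined list recovers more related pairs than either search alone before the first unrelated pair dilutes the ranking, which is the kind of gain the abstract reports. The same ranked list could also be used to compare the number of observed false positives at a cutoff with the number the E-value predicts.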