Rapid motif-based prediction of circular permutations in multi-domain proteins

Abstract
Motivation: Rearrangements of protein domains and motifs, such as swaps and circular permutations (CPs), can produce erroneous results when sequence databases are searched with traditional methods based on linear sequence alignment. Circular permutations are also biologically relevant because they help to elucidate both protein evolution and function.

Results: We have developed an algorithm, RASPODOM, which is based on the classical recursive alignment scheme. Sequences are represented as strings of domains taken from precompiled domain (motif) databases such as ProDom. The algorithm runs several orders of magnitude faster than a reimplementation of the existing CP detection algorithm that operates on strings of amino acids, produces virtually no false positives and allows true CPs to be discriminated from ‘intermediate’ CPs (iCPs). Several true CPs that have not previously been reported in the literature were identified from Swiss-Prot/TrEMBL within minutes.

Availability: Source code, additional scripts, data and a web-based interface can be found at http://www.uni-muenster.de/Biologie.Botanik/ebb/projects/raspodom/

Contact: ebb@uni-muenster.de
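
To illustrate the idea of treating proteins as strings of domains, the following minimal sketch checks whether one domain string is an exact rotation of another. It is not the RASPODOM algorithm: the recursive alignment scheme is replaced here by a plain rotation test that tolerates no insertions, deletions or mismatches, and the function name and domain identifiers (`D1`, `D2`, ...) are hypothetical placeholders rather than real ProDom entries.

```python
from typing import Optional, Sequence


def find_circular_permutation(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Return a cut point k (0 < k < len(a)) with b == a[k:] + a[:k], else None.

    Simplifying assumption: both proteins are given as equal-length lists of
    domain identifiers and only exact rotations count as circular permutations.
    """
    a, b = list(a), list(b)
    # Different lengths, trivial strings, or identical domain orders are not CPs.
    if len(a) != len(b) or len(a) < 2 or a == b:
        return None
    # Try every non-trivial cut point and compare the rotated string with b.
    for k in range(1, len(a)):
        if a[k:] + a[:k] == b:
            return k
    return None


# Hypothetical domain strings, for illustration only.
protein_a = ["D1", "D2", "D3", "D4"]
protein_b = ["D3", "D4", "D1", "D2"]  # protein_a rotated at position 2
protein_c = ["D1", "D3", "D2", "D4"]  # a domain swap, not a rotation

cut_ab = find_circular_permutation(protein_a, protein_b)  # 2
cut_ac = find_circular_permutation(protein_a, protein_c)  # None
```

Because the comparison runs over a handful of domain identifiers rather than hundreds of amino acids, even this naive test touches far fewer symbols per protein pair, which gives some intuition for why a domain-level representation can be much faster than a residue-level one.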