Prediction of unfolded segments in a protein sequence based on amino acid composition
Open Access
- 18 January 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (9) , 1891-1900
- https://doi.org/10.1093/bioinformatics/bti266
Abstract
Motivation: Partially and wholly unstructured proteins have now been identified in all kingdoms of life, more commonly in eukaryotic organisms. This intrinsic disorder is related to certain critical functions. Apart from their fundamental interest, unstructured regions in proteins may prevent crystallization. Therefore, the prediction of disordered regions is an important aspect for the understanding of protein function, but may also help to devise genetic constructs.

Results: In this paper we present a computational tool for the detection of unstructured regions in proteins based on two properties of unfolded fragments: (1) disordered regions have a biased composition and (2) they usually contain either small or no hydrophobic clusters. In order to quantify these two facts we first calculate the amino acid distributions in structured and unstructured regions. Using this distribution, we calculate for a given sequence fragment the probability to be part of either a structured or an unstructured region. For each amino acid, the distance to the nearest hydrophobic cluster is also computed. Using these three values along a protein sequence allows us to predict unstructured regions, with very simple rules. This method requires only the primary sequence, and no multiple alignment, which makes it an adequate method for orphan proteins.

Availability: http://genomics.eu.org/

Contact: Anne.Poupon@ibbmc.u-psud.fr
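
The three per-residue values described in the abstract (a score under the structured composition, a score under the unstructured composition, and the distance to the nearest hydrophobic cluster) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the hydrophobic alphabet, the cluster definition, the fragment length, or the decision rules, so `HYDROPHOBIC`, `min_run`, `window`, the add-one smoothing, and all function names below are assumptions chosen only for illustration.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
# Assumed hydrophobic alphabet; the abstract does not specify one.
HYDROPHOBIC = set("VILFMWY")


def composition(segments):
    """Amino acid frequencies over a list of segments, with add-one smoothing."""
    counts = {aa: 1 for aa in ALPHABET}
    for seg in segments:
        for aa in seg:
            if aa in counts:
                counts[aa] += 1
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def fragment_log_likelihood(fragment, freq):
    """Log-probability of a fragment under a composition; unknown residues are skipped."""
    return sum(math.log(freq[aa]) for aa in fragment if aa in freq)


def distance_to_nearest_cluster(seq, min_run=3):
    """Distance from each residue to the nearest hydrophobic cluster.

    A cluster is assumed here to be a run of at least `min_run` consecutive
    hydrophobic residues. Returns None for every position if no cluster exists.
    """
    in_cluster = [False] * len(seq)
    run_start = None
    for i, aa in enumerate(seq + "-"):  # sentinel closes a trailing run
        if aa in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    in_cluster[j] = True
            run_start = None
    positions = [i for i, flag in enumerate(in_cluster) if flag]
    if not positions:
        return [None] * len(seq)
    return [min(abs(i - p) for p in positions) for i in range(len(seq))]


def profile(seq, freq_ordered, freq_disordered, window=5):
    """Per-residue triples: (score as ordered, score as disordered, cluster distance)."""
    half = window // 2
    dist = distance_to_nearest_cluster(seq)
    rows = []
    for i in range(len(seq)):
        frag = seq[max(0, i - half): i + half + 1]
        rows.append((
            fragment_log_likelihood(frag, freq_ordered),
            fragment_log_likelihood(frag, freq_disordered),
            dist[i],
        ))
    return rows


# Toy training segments (invented for illustration, not real data).
freq_ordered = composition(["VILVILFAVL"])
freq_disordered = composition(["GSGSPEKSGS"])
rows = profile("GSGSVILGSG", freq_ordered, freq_disordered)
```

In this toy run, a residue whose fragment scores higher under `freq_disordered` than under `freq_ordered` and lies far from any hydrophobic cluster would be a candidate for an unstructured region. The thresholds that turn these three values into a prediction are the "very simple rules" mentioned in the abstract and are not reproduced here.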