PSIC: profile extraction from sequence alignments with position-specific counts of independent observations
Open Access
- 1 May 1999
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 12 (5) , 387-394
- https://doi.org/10.1093/protein/12.5.387
Abstract
Sequence weighting techniques are aimed at balancing redundant observed information from subsets of similar sequences in multiple alignments. Traditional approaches apply the same weight to all positions of a given sequence, hence equal efficiency of phylogenetic changes is assumed along the whole sequence. This restrictive assumption is not required for the new method PSIC (position-specific independent counts) described in this paper. The number of independent observations (counts) of an amino acid type at a given alignment position is calculated from the overall similarity of the sequences that share the amino acid type at this position with the help of statistical concepts. This approach allows the fast computation of position-specific sequence weights even for alignments containing hundreds of sequences. The PSIC approach has been applied to profile extraction and to the fold family assignment of protein sequences with known structures. Our method was shown to be very productive in finding distantly related sequences and more powerful than Hidden Markov Models or the profile methods in WiseTools and PSI-BLAST in many cases. The profile extraction routine is available on the WWW (http://www.bork.embl-heidelberg.de/PSIC or http://www.imb.ac.ru/PSIC).
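To make the idea of position-specific counts concrete, the following Python sketch groups the sequences of an alignment by the residue they carry at one column and shrinks each group's raw size according to how similar its members are elsewhere. It is an illustration only: the abstract does not state the PSIC statistics, so the shrinkage rule used here (1 + (n - 1) × (1 - mean pairwise identity)) is a stand-in heuristic, and the names `pairwise_identity` and `effective_counts` are hypothetical, not part of the published software.

```python
from itertools import combinations


def pairwise_identity(a, b, skip):
    """Fraction of identical, non-gap residues between a and b, ignoring column `skip`."""
    same = total = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if i == skip or x == "-" or y == "-":
            continue
        total += 1
        same += x == y
    return same / total if total else 0.0


def effective_counts(alignment, col):
    """Map each residue type at `col` to a heuristic number of independent observations.

    ASSUMPTION: the shrinkage rule below is a simple placeholder,
    not the statistical model derived in the PSIC paper.
    """
    # Collect the sequences that share each residue type at this column.
    groups = {}
    for seq in alignment:
        if seq[col] != "-":
            groups.setdefault(seq[col], []).append(seq)

    counts = {}
    for residue, seqs in groups.items():
        n = len(seqs)
        if n == 1:
            counts[residue] = 1.0
            continue
        pairs = list(combinations(seqs, 2))
        mean_id = sum(pairwise_identity(a, b, col) for a, b in pairs) / len(pairs)
        # Identical sequences collapse to one observation;
        # sequences with no other similarity each count in full.
        counts[residue] = 1.0 + (n - 1) * (1.0 - mean_id)
    return counts


alignment = [
    "ACDEF",
    "ACDEF",
    "AKLMN",
    "GCDEF",
]
counts = effective_counts(alignment, 0)
```

In this toy alignment three sequences carry `A` at the first column, but two of them are identical elsewhere, so the heuristic credits `A` with fewer than three independent observations. Because the count is computed separately for every column and residue type, the resulting weights are position-specific rather than one weight per sequence.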