CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
Open Access
- 11 November 1994
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 22 (22) , 4673-4680
- https://doi.org/10.1093/nar/22.22.4673
Abstract
The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.
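The abstract does not give the weighting formula, so the following is only a minimal sketch of one tree-based scheme that matches the stated goal: each sequence's weight is the sum of the branch lengths from its leaf to the root, with every shared branch divided equally among the sequences below it. The `Node` class, the function names, and the toy tree with its branch lengths are all assumptions made for illustration and are not taken from the program.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a rooted guide tree (hypothetical helper type)."""
    name: str = ""
    length: float = 0.0               # length of the branch above this node
    children: list = field(default_factory=list)


def leaf_names(node):
    """Return the names of all leaves at or below this node."""
    if not node.children:
        return [node.name]
    names = []
    for child in node.children:
        names.extend(leaf_names(child))
    return names


def sequence_weights(root):
    """Sum branch lengths from leaf to root, splitting shared branches equally."""
    weights = {}

    def walk(node, inherited):
        # Each branch contributes length / (number of leaves sharing it).
        share = inherited + node.length / len(leaf_names(node))
        if not node.children:
            weights[node.name] = share
        else:
            for child in node.children:
                walk(child, share)

    walk(root, 0.0)
    return weights


# Toy tree (invented values): A and B are near-duplicates, C is divergent.
tree = Node(children=[
    Node(length=0.4, children=[Node("A", 0.1), Node("B", 0.1)]),
    Node("C", 0.5),
])
weights = sequence_weights(tree)
print(weights)
```

In this toy tree the two closely related sequences A and B split their shared branch and end up with a weight of about 0.3 each, while the divergent sequence C keeps its full branch and gets about 0.5, so near-duplicates are down-weighted relative to the outlier.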
This publication has 32 references indexed in Scilit:
- Improved sensitivity of profile searches through the use of sequence weights and gap excision. Bioinformatics, 1994
- Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences, 1988
- A strategy for the rapid multiple alignment of protein sequences. Journal of Molecular Biology, 1987
- Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution, 1987
- The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 1987
- Determinants of a protein fold. Journal of Molecular Biology, 1987
- A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Research, 1984
- Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features. Biopolymers, 1983
- Comparative biosequence metrics. Journal of Molecular Evolution, 1981
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970