The capsid protein of Semliki Forest virus has clusters of basic amino acids and prolines in its amino-terminal region.

Abstract
The amino acid sequence of the capsid (C) protein was deduced from the nucleotide sequence of the C gene. This part of the viral 42S RNA genome was transcribed into double-stranded cDNA. The cDNA was cloned in the Escherichia coli χ1776-pBR322 host-vector system and then the base sequence was determined with the technique described by Maxam and Gilbert. The amino acid sequence of the C protein shows a clustering of basic amino acids and prolines within the first 110 amino acids.
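One way to quantify this kind of clustering is to compare the fraction of basic residues and prolines in the amino-terminal region with the fraction in the rest of the protein. The following is a minimal sketch and not the authors' method. The function names, the choice of lysine and arginine as the "basic" set (histidine is excluded here by assumption), and the toy sequence are all hypothetical. The toy sequence is not the Semliki Forest virus C protein.

```python
from typing import Dict, FrozenSet

# Assumption: "basic" means lysine (K) and arginine (R); histidine is excluded.
BASIC: FrozenSet[str] = frozenset("KR")
PROLINE: FrozenSet[str] = frozenset("P")


def residue_fraction(seq: str, residues: FrozenSet[str]) -> float:
    """Return the fraction of positions in seq whose residue is in residues."""
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in residues) / len(seq)


def compare_regions(protein: str, boundary: int = 110) -> Dict[str, Dict[str, float]]:
    """Compare residue composition before and after a boundary position.

    protein  -- one-letter amino acid sequence
    boundary -- number of amino-terminal residues to treat as one region
                (110 follows the abstract; any other value is illustrative)
    """
    protein = protein.upper()
    n_term, rest = protein[:boundary], protein[boundary:]
    return {
        "n_terminal": {
            "basic": residue_fraction(n_term, BASIC),
            "proline": residue_fraction(n_term, PROLINE),
        },
        "remainder": {
            "basic": residue_fraction(rest, BASIC),
            "proline": residue_fraction(rest, PROLINE),
        },
    }


if __name__ == "__main__":
    # Hypothetical 20-residue toy sequence, split at position 10 for illustration.
    toy = "MKRPKPRAKK" + "GSTLVDEAGS"
    print(compare_regions(toy, boundary=10))
```

A markedly higher basic or proline fraction in the amino-terminal region than in the remainder would be consistent with the clustering described above. A sliding-window count would show where within the region the clusters fall.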