A Single Expression Site with a Conserved Leader Sequence Regulates Variation of Expression of the Pneumocystis carinii Family of Major Surface Glycoprotein Genes

Abstract
The major surface glycoprotein (MSG) of Pneumocystis carinii is encoded by a family of related but distinct genes distributed throughout the P. carinii genome. Previous reports of the genomic and mRNA MSG structure suggested that there was a highly conserved 5′-untranslated region and a highly variable translated region. In the current study, we demonstrate that MSG is expressed from a single genomic expression site and that different MSG genes are located downstream of this site. Isolation of a genomic clone containing the putative 5′-untranslated region revealed a single-base sequencing error in what had been considered the untranslated region. The corrected sequence reveals an extended open reading frame encoding a constant amino-terminal leader domain, with a typical signal peptide, for the MSG protein family. Since this constant amino-terminal domain is encoded by a single-copy genomic sequence, a recombination/gene conversion-mediated antigenic switching event is required to effect the known variability in expressed MSG sequences. Therefore, like some bacterial and protozoan pathogens, the opportunistic fungal pathogen P. carinii contains a constant genomic site dedicated to MSG expression and a switchable downstream region for the variable part of the MSG gene family.