Allelic variation in the contiguous loci encoding Candida albicans ALS5, ALS1 and ALS9

Abstract
The ALS gene family of Candida albicans consists of eight genes (ALS1 to ALS7 and ALS9) that encode cell-wall glycoproteins involved in adhesion to host surfaces. Considerable allelic sequence variability has been documented for regions of ALS genes encoding repeated sequences. Although regions of ALS genes encoding non-repeated sequences tend to be more conserved, some sequence divergence has been noted, particularly for alleles of ALS5. Data from the C. albicans genome sequencing project provided the first indication that strain SC5314 encoded two divergent ALS9-like sequences and that three of the ALS genes (ALS5, ALS1 and ALS9) were contiguous on chromosome 6. Data from PCR analysis and construction of both single and double deletion mutants indicated that the divergent sequences were alleles of ALS9, and located downstream of ALS5 and ALS1. Sequences within the 5′ domain of ALS9-1 and ALS9-2 varied by 11 %. Within the 3′ domain of each allele, extra nucleotides were present in two regions of ALS9-2, designated Variable Block 1 (VB1) and Variable Block 2 (VB2). Analysis of strains from the five major C. albicans genetic clades showed that both ALS9 alleles are widespread among these strains, that the sequences of ALS9-1 and ALS9-2 are conserved among diverse strains and that recombinant ALS9 alleles have been generated during C. albicans evolution. Phylogenetic analysis showed that, although divergent in sequence, ALS9 alleles are more similar to each other than to any other ALS genes. The degree of sequence divergence for ALS9 greatly exceeds that observed previously for other ALS genes and may result in functional differences for the proteins encoded by the two alleles.