Characteristics of the Large (dA)·(dT) Homopolymer Tracts in D. discoideum Gene Flanking and Intron Sequences

Abstract
D. discoideum, the slime mold, has one of the most AT-rich eukaryotic genomes known. In this paper we examine this organism's sequence database for overlapping N-tuples of high frequency and find that A and T tracts are among the most frequent in flanking sequences but not in coding sequences. We examined both overlapping and non-overlapping frequencies of the A, T, G and C homopolymer tracts of 2 < N < 6. Overlapping (dG)·(dC) and (dA)·(dT) tracts occur at greater frequencies than expected on the basis of random occurrence. Long (dA)·(dT) tracts of N > 10 occur at well above expected frequencies in flanking and intron regions, while (dG)·(dC) tracts above N = 5 are rarely found. Some of the implications of these findings for tract origins in slip-strand replication and for chromatin structure are discussed.
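
The distinction between overlapping and non-overlapping tract counts, and the comparison with random expectation, can be illustrated with a minimal Python sketch. It is not the procedure used in this paper: the function names and the toy sequence are hypothetical, the scan covers a single strand, and the random model assumes that bases are drawn independently at the composition observed in the sequence itself.

```python
import re


def overlapping_count(seq, base, n):
    """Count every window of length n made up only of `base`; windows may overlap."""
    target = base * n
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == target)


def nonoverlapping_count(seq, base, n):
    """Count disjoint occurrences of a run of n copies of `base`, scanning left to right."""
    return len(re.findall(f"{base}{{{n}}}", seq))


def expected_overlapping(seq, base, n):
    """Expected overlapping count under an independent-base model (an assumption)."""
    p = seq.count(base) / len(seq)          # observed frequency of the base
    return (len(seq) - n + 1) * p ** n      # number of windows times P(all n match)


# Hypothetical toy sequence, not taken from the D. discoideum database.
seq = "AAAAATTGCAAA"
observed = overlapping_count(seq, "A", 3)
disjoint = nonoverlapping_count(seq, "A", 3)
expected = expected_overlapping(seq, "A", 3)
print(observed, disjoint, round(expected, 2))
```

In this toy case a run of five A residues contributes three overlapping windows of N = 3 but only one non-overlapping tract, which is why the two counting schemes are reported separately.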