Co-polymer tracts in eukaryotic, prokaryotic, and organellar DNA

Abstract
Large variations in DNA base composition and noticeable strand asymmetries are known to occur between different organisms and within different regions of the genomes of single organisms. Apparently, such composition and sequence biases occur to fulfill structural rather than informational requirements. Here we report the wide occurrence of a more subtle biasing of DNA sequence that can have structural consequences: an increase or a suppression of the number of long tracts of two-base co-polymers. Strong biases were observed when the DNA sequences of the longest eukaryotic, prokaryotic, and organellar entries in the GenBank database (totaling 773 kilobases) were analyzed for the number of occurrences of tracts of the two-base co-polymers (A,T)n, (G,C)n, and (A,C)n as a function of tract length. (The expression (A,T)n is used here to denote an uninterrupted tract, n nucleotides in length, of A and T bases in any proportion or order, terminated at each end by a G or C residue.) Characteristic differences were also observed between the tract biases of eukaryotic and prokaryotic organisms.
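
The tract definition given above can be illustrated with a minimal Python sketch. This is not the procedure used in the study; it only shows one way to tabulate tracts by length under stated assumptions. The function name tract_length_counts and its parameters are hypothetical. The sketch assumes that a tract counts only when it is flanked on both sides by one of the two bases outside the pair, so tracts touching either end of the sequence, or adjacent to an ambiguous symbol such as N, are skipped.

```python
from collections import Counter
from itertools import groupby


def tract_length_counts(seq, pair):
    """Count two-base co-polymer tracts in `seq`, keyed by tract length.

    `pair` names the two bases of the co-polymer, e.g. "AT" for (A,T)n.
    Assumption of this sketch: a tract is counted only if it is bounded on
    both sides by one of the other two bases, so runs at the sequence ends
    or next to non-ACGT symbols are ignored.
    """
    seq = seq.upper()
    members = set(pair.upper())
    terminators = set("ACGT") - members
    counts = Counter()
    pos = 0
    # groupby splits the sequence into maximal runs of member / non-member bases.
    for in_tract, run in groupby(seq, key=lambda base: base in members):
        n = len(list(run))
        start, end = pos, pos + n
        pos = end
        if not in_tract:
            continue
        flanked = (
            start > 0
            and end < len(seq)
            and seq[start - 1] in terminators
            and seq[end] in terminators
        )
        if flanked:
            counts[n] += 1
    return counts
```

For example, under these assumptions tract_length_counts("GATTACAGGCATATCG", "AT") reports two (A,T) tracts of length 4 and one of length 1. Comparing such counts with those expected for a random sequence of the same base composition would require a separate null model, which this sketch does not attempt.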