Abstract
Analysis of the sequence data available today, comprising more than 500,000 bases, confirms the previously observed phenomenon that there are distinct dinucleotide preferences in DNA sequences. Consistent behaviour is observed in the major sequence groups analysed here: prokaryotes, eukaryotes and mitochondria. Some doublet preferences are common to all groups and are found in most sequences of the Los Alamos Library. The patterns seen in such large data sets are highly significant, both statistically and biologically. Since they are present in numerous and diverse nucleotide sequences, one may conclude that they confer evolutionary advantages on the organism. In eukaryotes, RR and YY dinucleotides are preferred over YR and RY (where R is a purine and Y a pyrimidine). Since opposite-chain nearest-neighbour purine clashes are major determinants of DNA structure, it appears that the tight packaging of DNA in nucleosomes generally disfavours the steric repulsion associated with YR and RY steps.
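As an illustration of how a dinucleotide preference might be quantified, the sketch below computes an observed/expected ratio for each doublet and the fraction of RR, YY, RY and YR steps in a sequence. The abstract does not state which statistic was used, so the observed/expected ratio (doublet frequency divided by the product of the two base frequencies) is an assumption, and the function names `dinucleotide_odds` and `step_class_fractions` are hypothetical.

```python
from collections import Counter

BASES = "ACGT"
PURINES = set("AG")  # R; every other unambiguous base (C, T) is a pyrimidine, Y


def dinucleotide_odds(seq):
    """Return the observed/expected ratio for each doublet present in seq.

    Assumed statistic: f(XY) / (f(X) * f(Y)), using overlapping doublets.
    A ratio above 1 suggests a preferred doublet, below 1 an avoided one.
    """
    seq = seq.upper()
    # Overlapping doublets, skipping any that contain an ambiguous symbol.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)
             if seq[i] in BASES and seq[i + 1] in BASES]
    if not pairs:
        return {}
    pair_counts = Counter(pairs)
    base_counts = Counter(b for b in seq if b in BASES)
    n_pairs = len(pairs)
    n_bases = sum(base_counts.values())
    odds = {}
    for pair, count in pair_counts.items():
        expected = (base_counts[pair[0]] / n_bases) * (base_counts[pair[1]] / n_bases)
        odds[pair] = (count / n_pairs) / expected
    return odds


def step_class_fractions(seq):
    """Return the fraction of doublets falling in each of RR, YY, RY, YR."""
    seq = seq.upper()
    counts = Counter()
    for a, b in zip(seq, seq[1:]):
        if a in BASES and b in BASES:
            cls = ("R" if a in PURINES else "Y") + ("R" if b in PURINES else "Y")
            counts[cls] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: counts[cls] / total for cls in ("RR", "YY", "RY", "YR")}
```

Under these assumptions, a sequence showing the eukaryotic pattern described above would give combined RR and YY fractions noticeably larger than the combined RY and YR fractions. Judging significance would additionally require a statistical test, which this sketch does not attempt.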