Abstract
Methods are presented for calculating the number and type of different DNA sequences generated by base excision and insertion events at a given site in a known DNA sequence. We calculate, for example, that excision of the Mu1 transposon from the bz1::Mu1 allele of maize should generate more than 500,000 unique alleles given the extent of base deletion (up to 34 bases removed) and base insertion (0–5 bases) observed thus far in sequenced excision alleles. Analysis of this universe of potential alleles can be used to predict, for instance, the frequency with which stop codons or repair-generated duplications are created. In general, knowledge of the distribution of alleles can be used to evaluate models of both excision and repair by determining whether particular events occur more frequently than expected. Such quantitative analysis complements the qualitative description provided by the DNA sequence of individual events. Similar methods can be used to evaluate the outcome of other cases of DNA breakage and repair, such as programmed V(D)J recombination in immunoglobulin genes.
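The counting problem described above can be illustrated with a small brute-force sketch. This is not the method of the paper, only a toy model under stated assumptions: bases are deleted from either side of the break up to a combined limit, any string of A, C, G, or T up to a maximum length may be inserted, and the names `enumerate_alleles`, `left`, `right`, `max_del`, and `max_ins` are hypothetical. It shows why the number of unique alleles can be smaller than the number of deletion and insertion events, since different events may yield the same sequence.

```python
from itertools import product

BASES = "ACGT"

def enumerate_alleles(left, right, max_del, max_ins):
    """Toy model: return (n_events, alleles) for a break between left and right.

    Assumption: up to max_del bases in total are removed from the two
    flanks, then 0..max_ins arbitrary bases are inserted at the junction.
    """
    # All candidate insertions, including the empty insertion.
    inserts = [""]
    for k in range(1, max_ins + 1):
        inserts.extend("".join(p) for p in product(BASES, repeat=k))

    n_events = 0
    alleles = set()
    # i bases removed from the end of the left flank,
    # j bases removed from the start of the right flank, i + j <= max_del.
    for i in range(min(max_del, len(left)) + 1):
        for j in range(min(max_del - i, len(right)) + 1):
            for ins in inserts:
                n_events += 1
                alleles.add(left[:len(left) - i] + ins + right[j:])
    return n_events, alleles

n_events, alleles = enumerate_alleles("AA", "AA", max_del=1, max_ins=0)
print(n_events, len(alleles))  # 3 events, 2 unique sequences
```

In this small case, deleting one base from either flank gives the same product (`AAA`), so three events collapse to two alleles. Brute-force enumeration of this kind is only practical for small limits, because the number of candidate insertions grows as a power of four.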