Abstract
The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent, because high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate the stochastic variation associated with the entire DNA process, starting with extraction of the sample, followed by preparation of the PCR reaction mixture and PCR itself. Each part of the process is modelled with input efficiency parameters. The key output parameters that define the characteristics of a DNA profile are then derived, namely heterozygote balance (Hb) and the probability of allelic drop-out, p(D). The model can also be used to estimate unknown efficiency parameters, such as pi(extraction). 'What-if' scenarios can be used to improve and optimize the entire process: for example, the improvement to a given DNA profile expected from increasing the aliquot forwarded to PCR can be reliably predicted. We demonstrate that Hb and drop-out are mainly a function of the stochastic effects of pre-PCR molecular selection. Whole genome amplification is unlikely to give any benefit over conventional PCR for LCN.
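The pre-PCR selection steps described above can be illustrated with a minimal Monte Carlo sketch. It is not the graphical model of the paper. It assumes that each template molecule independently survives extraction with probability `pi_extraction` and is then forwarded in the aliquot with probability `pi_aliquot`. It scores an allele as dropped out when no copy reaches the PCR tube, and uses the min/max ratio of template counts as a stand-in for Hb. PCR itself and any detection threshold are deliberately left out. All function names, parameter names and numerical values are hypothetical and chosen for illustration only.

```python
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) as a sum of n independent Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_pre_pcr(n_cells, pi_extraction, pi_aliquot, rng):
    """Template counts of the two alleles of a heterozygote reaching PCR.

    Assumption: each diploid cell contributes one copy of each allele.
    """
    counts = []
    for _ in range(2):
        extracted = binomial(n_cells, pi_extraction, rng)    # extraction step
        counts.append(binomial(extracted, pi_aliquot, rng))  # aliquot step
    return counts[0], counts[1]


def summarise(n_cells, pi_extraction, pi_aliquot, n_sims=20000, seed=1):
    """Estimate the per-allele drop-out proxy and the mean min/max balance."""
    rng = random.Random(seed)
    dropouts = 0
    ratios = []
    for _ in range(n_sims):
        a, b = simulate_pre_pcr(n_cells, pi_extraction, pi_aliquot, rng)
        dropouts += (a == 0) + (b == 0)
        if a > 0 and b > 0:
            # Balance is only defined when both alleles are present.
            ratios.append(min(a, b) / max(a, b))
    p_dropout = dropouts / (2 * n_sims)
    mean_balance = sum(ratios) / len(ratios) if ratios else float("nan")
    return p_dropout, mean_balance


def analytic_dropout(n_cells, pi_extraction, pi_aliquot):
    """Probability that none of n_cells copies survives both selection steps."""
    return (1.0 - pi_extraction * pi_aliquot) ** n_cells


if __name__ == "__main__":
    # 'What-if' comparison: double the aliquot forwarded to PCR.
    for aliquot in (0.4, 0.8):
        p_d, balance = summarise(10, 0.6, aliquot)
        print(aliquot, round(p_d, 4), round(balance, 3),
              round(analytic_dropout(10, 0.6, aliquot), 4))
```

Under these assumptions the two selection steps collapse into a single binomial draw with success probability `pi_extraction * pi_aliquot`, so the simulated drop-out frequency can be checked against the closed form `(1 - pi_extraction * pi_aliquot) ** n_cells`. For 10 cells, with `pi_extraction = 0.6` and `pi_aliquot = 0.4`, the closed form gives about 0.064. Modelling PCR amplification and a detection threshold, as the full model does, would require additional steps not shown here.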