Statistics of single‐molecule measurements: Applications in flow‐cytometry sizing of DNA fragments

Abstract
Background: The measurement of physical properties from single molecules has been demonstrated. However, the majority of single‐molecule studies report values based on relatively large data sets (e.g., N > 50). While there are studies that report physical quantities based on small sample sets, there has not been a detailed statistical analysis relating sample size to the reliability of derived parameters.

Methods: Monte Carlo simulations and multinomial analysis, dependent on quantifiable experimental parameters, were used to determine the minimum number of single‐molecule measurements required to produce an accurate estimate of a population mean. Simulation results were applied to the fluorescence‐based sizing of DNA fragments by ultrasensitive flow cytometry (FCM).

Results: Our simulations show, for an analytical technique with a 10% CV, that the average of as few as five single‐molecule measurements would provide a mean value within one SD of the population mean. Additional simulations determined the number of measurements required to obtain the desired number of replicates for each subpopulation within a mixture. Application of these results to flow cytometry data for λ/HindIII and S. aureus Mu50/SmaI DNA digests produced accurate DNA fingerprints from as few as 98 single‐molecule measurements.

Conclusions: A surprisingly small number of single‐molecule measurements are required to obtain a mean measurement descriptive of a normally‐distributed parent population.
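
The two kinds of simulation described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a normally distributed population, reads "within one SD" as one single‐measurement SD of the population, and treats a mixture as independent draws with fixed subpopulation fractions. The function names, trial counts, seeds, and the mixture fractions in the usage note are hypothetical choices made for illustration.

```python
import random


def fraction_within_one_sd(n, cv=0.10, trials=20000, seed=0):
    """Monte Carlo estimate of how often the mean of n measurements falls
    within one population SD of the true mean (normal population assumed)."""
    rng = random.Random(seed)
    true_mean = 1.0          # arbitrary units; only the ratio SD/mean matters
    sd = cv * true_mean      # CV = SD / mean
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if abs(sample_mean - true_mean) <= sd:
            hits += 1
    return hits / trials


def prob_min_replicates(n_total, fractions, min_replicates, trials=5000, seed=0):
    """Monte Carlo estimate of the probability that n_total measurements of a
    mixture yield at least min_replicates events in every subpopulation.
    `fractions` are the assumed relative abundances of the subpopulations."""
    rng = random.Random(seed)
    labels = range(len(fractions))
    successes = 0
    for _ in range(trials):
        counts = [0] * len(fractions)
        # Each measurement lands in one subpopulation (a multinomial draw).
        for label in rng.choices(labels, weights=fractions, k=n_total):
            counts[label] += 1
        if min(counts) >= min_replicates:
            successes += 1
    return successes / trials
```

Under these assumptions, `fraction_within_one_sd(5)` can be checked against the analytical value: the standard error of a five‐measurement mean is SD/√5, so the mean lies within one SD with probability P(|Z| ≤ √5) ≈ 0.975. A call such as `prob_min_replicates(100, [0.4, 0.3, 0.2, 0.1], 5)` then estimates how likely 100 events are to supply at least five replicates of every fragment class in a hypothetical four‐component mixture.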