Comparison of different melting temperature calculation methods for short DNA sequences

Abstract
Motivation: The overall performance of several molecular biology techniques involving DNA/DNA hybridization depends on the accurate prediction of the experimental value of a critical parameter: the melting temperature (Tm). To date, many software programs based on different methods and/or parameterizations are available for the theoretical estimation of the experimental Tm value of any given short oligonucleotide sequence. However, in most cases, large and significant differences in the estimated Tm are obtained when different methods are used. Thus, it is difficult to decide which Tm value is the accurate one. In addition, it seems that most users of these methods are unaware of their limitations, which are well described in the literature but are neither stated properly nor enforced as input restrictions by most of the web servers and standalone software programs that implement them.
Results: A quantitative comparison of the similarities and differences among some of the published DNA/DNA Tm calculation methods is reported. The comparison was carried out for a large set of short oligonucleotide sequences ranging from 16 to 30 nt in length, which span the whole range of CG-content. The results showed significant differences among all the methods, which in some cases depend on the oligonucleotide length and CG-content in a non-trivial manner. Based on these results, the regions of consensus and disagreement among the methods in the oligonucleotide feature space are reported. Owing to the lack of sufficient experimental data, a fair and complete assessment of the accuracy of the different methods is not yet possible. In spite of this limitation, a consensus Tm with minimal error probability was calculated by averaging the values obtained from two or more methods that exhibit similar behavior for each particular combination of oligonucleotide length and CG-content class. Using a total of 348 DNA sequences in the size range between 16mer and 30mer, for which experimental Tm data are available, we demonstrated that the consensus Tm is a robust and accurate measure. It is expected that the results of this work will constitute a useful set of guidelines for the successful experimental implementation of various molecular biology techniques, such as quantitative PCR, multiplex PCR and the design of optimal DNA microarrays.
Availability: A binary software distribution for the LINUX operating system, which calculates the consensus Tm described in this work for thousands of oligonucleotides simultaneously, is freely available upon request from the authors or from our website http://protein.bio.puc.cl/melting-temperatures.html
Contact: fmelo@bio.puc.cl
Supplementary information: The large set of oligonucleotides, the detailed results of the comparative and accuracy benchmarks, and hundreds of comparative graphs generated during this work are available at our website http://protein.bio.puc.cl/melting-temperatures.html
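Illustrative sketch: The averaging step behind the consensus Tm can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions, not the distributed software: the function names, the method labels and the numeric values are hypothetical placeholders, and the choice of which methods agree for a given oligonucleotide length and CG-content class is assumed to be supplied by the caller, since the abstract does not specify it.

```python
from statistics import mean
from typing import Mapping, Sequence


def cg_content(seq: str) -> float:
    """Return the fraction of C and G bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "CG" for base in seq) / len(seq)


def consensus_tm(tm_by_method: Mapping[str, float],
                 agreeing: Sequence[str]) -> float:
    """Average the Tm values of the methods listed in `agreeing`.

    `tm_by_method` maps a (hypothetical) method label to its predicted Tm
    in degrees Celsius. `agreeing` names the methods assumed to behave
    similarly for this oligonucleotide's length and CG-content class.
    """
    if len(agreeing) < 2:
        raise ValueError("a consensus needs at least two methods")
    return mean(tm_by_method[name] for name in agreeing)


# Placeholder inputs for illustration only; these are not measured or
# published values.
oligo = "ACGTACGTACGTACGT"  # 16 nt
predictions = {"method_a": 60.0, "method_b": 62.0, "method_c": 70.0}

print(len(oligo), cg_content(oligo))
print(consensus_tm(predictions, ["method_a", "method_b"]))
```

In this sketch the outlying prediction is simply left out of the average because the caller did not list it as agreeing; how the agreeing set is determined for each length and CG-content class is outside what the sketch covers.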