A standard file format for data from DNA sequencing instruments

Abstract
There are now a number of machines for determining DNA sequences. These devices are currently of two types: those, such as the Applied Biosystems 373A and the Pharmacia A.L.F., which interpret the sequences of samples as they run on gels within the machine, and those, such as the Bio-Rad and Amersham readers, which scan and analyse conventional autoradiographs. Both types of machine can produce their data in the form of traces, which represent the band intensity of each of the four base types at each position in the sequence. At present all the machines write files in different formats. We describe a machine-independent format for storing data derived from automatic sequencing machines. Files in this format can store the derived sequence, the traces and a set of confidence measures for each base. We have adopted the format as the standard for our sequence handling software.
