Optimal Reconstruction of a Sequence from its Probes
- 1 October 1999
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 6 (3-4), 361-368
- https://doi.org/10.1089/106652799318328
Abstract
An important combinatorial problem, motivated by DNA sequencing in molecular biology, is the reconstruction of a sequence over a small finite alphabet from the collection of its probes (the sequence spectrum), obtained by sliding a fixed sampling pattern over the sequence. Such construction is required for Sequencing-by-Hybridization (SBH), a novel DNA sequencing technique based on an array (SBH chip) of short nucleotide sequences (probes). Once the sequence spectrum is biochemically obtained, a combinatorial method is used to reconstruct the DNA sequence from its spectrum. Since technology limits the number of probes on the SBH chip, a challenging combinatorial question is the design of a smallest set of probes that can sequence an arbitrary DNA string of a given length. We present in this work a novel probe design, crucially based on the use of universal bases [bases that bind to any nucleotide (Loakes and Brown, 1994)] that drastically improves the performance of the SBH process and asymptotically approaches the information-theoretic bound up to a constant factor. Furthermore, the sequencing algorithm we propose is substantially simpler than the Eulerian path method used in previous solutions of this problem.
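To make the notion of a spectrum concrete, the following is a minimal sketch of how a fixed sampling pattern sliding over a sequence yields a set of probes. The pattern encoding ('1' for a natural base, '0' for a universal base), the '*' marker, and the function names `spectrum` and `chip_size` are assumptions made for illustration; the abstract does not specify the paper's actual probe pattern, and the patterns used here are not claimed to be that design.

```python
from typing import Set

UNIVERSAL = "*"  # hypothetical marker for a universal-base position


def spectrum(sequence: str, pattern: str) -> Set[str]:
    """Collect the probes seen when `pattern` slides over `sequence`.

    `pattern` is a string of '1' (natural base, copied from the sequence)
    and '0' (universal base, binds to anything, recorded as '*').
    """
    if not pattern or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of '0' and '1'")
    width = len(pattern)
    probes = set()
    # Slide the pattern one position at a time over the sequence.
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        probe = "".join(
            base if bit == "1" else UNIVERSAL
            for base, bit in zip(window, pattern)
        )
        probes.add(probe)
    return probes


def chip_size(pattern: str, alphabet_size: int = 4) -> int:
    """Number of distinct probes a chip needs for this pattern.

    Only natural-base positions multiply the probe count; universal
    positions add probe length without adding probes.
    """
    return alphabet_size ** pattern.count("1")


if __name__ == "__main__":
    seq = "ACGTAC"
    print(sorted(spectrum(seq, "111")))   # contiguous 3-mers
    print(sorted(spectrum(seq, "1101")))  # gapped probes, one universal base
    print(chip_size("111"), chip_size("1101"))
```

In this sketch both patterns need the same number of probes on the chip, since each has three natural-base positions, but the gapped pattern spans four positions of the sequence. This is only meant to illustrate why universal bases can extend probe reach without enlarging the chip; it says nothing about the reconstruction algorithm itself.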
This publication has 8 references indexed in Scilit:
- Poisson Process Approximation for Sequence Repeats, and Sequencing by Hybridization. Journal of Computational Biology, 1996
- 5-Nitroindole as an universal base analogue. Nucleic Acids Research, 1994
- The Probability of Unique Solutions of Sequencing by Hybridization. Journal of Computational Biology, 1994
- Improved Chips for Sequencing by Hybridization. Journal of Biomolecular Structure and Dynamics, 1991
- l-Tuple DNA Sequencing: Computer Analysis. Journal of Biomolecular Structure and Dynamics, 1989
- Sequencing of megabase plus DNA by hybridization: Theory of the method. Genomics, 1989
- Two Moments Suffice for Poisson Approximations: The Chen-Stein Method. The Annals of Probability, 1989
- A novel method for nucleic acid sequence determination. Journal of Theoretical Biology, 1988