Four Strikes Against Physical Mapping of DNA

Abstract
Physical mapping is a central problem in molecular biology and the Human Genome Project. The problem is to reconstruct the relative positions of fragments of DNA along the genome from information on their pairwise overlaps. We show that four simplified models of the problem lead to NP-complete decision problems: colored unit interval graph completion, maximum interval (or unit interval) subgraph, pathwidth of a bipartite graph, and the k-consecutive ones problem for k ≥ 2. These models have been chosen to reflect various features typical of biological data, including false-negative and false-positive errors, small width of the map, and chimericism.

Key words: physical mapping; NP-completeness; interval graphs; k-consecutive ones problem