Reconstructing a three-dimensional model with arbitrary errors

Abstract
A number of current technologies allow for the determination of interatomic distance information in structures such as proteins and RNA. Thus, the reconstruction of a three-dimensional set of points using information about its interpoint distances has become a task of basic importance in determining molecular structure. The distance measurements one obtains from techniques such as NMR are typically sparse and error-prone, greatly complicating the reconstruction task. Many of these errors result in distance measurements that can be safely assumed to lie within certain fixed tolerances. But a number of sources of systematic error in these experiments lead to inaccuracies in the data that are very hard to quantify; in effect, one must treat certain entries of the measured distance matrix as being arbitrarily “corrupted.” The existence of arbitrary errors leads to an interesting sort of error-correction problem—how many corrupted entries in a distance matrix can be efficiently corrected to produce a consistent three-dimensional structure? For the case of an n × n matrix in which every entry is specified, we provide a randomized algorithm running in time O(n log n) that enumerates all structures consistent with at most (1/2-ε)n errors per row, with high probability. In the case of randomly located errors, we can correct errors of the same density in a sparse matrix-one in which only a β fraction of the entries in each row are given, for any constant βgt;0.

This publication has 14 references indexed in Scilit: