Fast Similarity Search in Three-Dimensional Structure Databases

Abstract
Given a database 𝒟 of three-dimensional (3D) molecular structures and a target molecule Q, the similarity search problem is to find the molecules O in 𝒟 that match Q after allowing for an arbitrary number of whole-structure rotations and translations, together with a specified number of edit operations. The edit operations are relabeling an atom, deleting an atom, and inserting an atom. This search operation arises in many biochemical applications. In this paper we study the similarity search problem and a class of related queries. We present a technique for processing these queries based on geometric hashing, a method that originated in computer vision. Experimental results on a database of 3D molecular structures obtained from the National Cancer Institute indicate that the presented technique performs well.
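To make the idea concrete, the following is a minimal sketch of geometric hashing for labeled 3D point sets. It is an illustration under stated assumptions, not the paper's actual implementation. The function names (`frame_keys`, `build_index`, `search`), the representation of an atom as `(label, (x, y, z))`, and the bin size of 0.5 are all hypothetical. Each ordered triple of non-collinear atoms defines a local coordinate frame; the remaining atoms are expressed in that frame, quantized, and stored in a hash table. Because the frame moves with the molecule, the keys are unchanged by rotation and translation, and a relabeled, deleted, or inserted atom only lowers or leaves unchanged the vote count rather than ruling out the match.

```python
import math
from collections import defaultdict
from itertools import permutations

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)

def frame_keys(atoms, basis, bin_size=0.5):
    """Yield a quantized (label, i, j, k) key for every non-basis atom,
    with coordinates taken in the frame spanned by the three basis atoms."""
    p0, p1, p2 = (atoms[i][1] for i in basis)
    e1 = _unit(_sub(p1, p0))
    if e1 is None:
        return  # first two basis atoms coincide
    e3 = _unit(_cross(e1, _sub(p2, p0)))
    if e3 is None:
        return  # basis atoms are collinear, so no frame is defined
    e2 = _cross(e3, e1)
    for idx, (label, pos) in enumerate(atoms):
        if idx in basis:
            continue
        d = _sub(pos, p0)
        yield (label,) + tuple(round(_dot(d, e) / bin_size) for e in (e1, e2, e3))

def build_index(database, bin_size=0.5):
    """Hash every molecule under every ordered basis triple (assumed small molecules)."""
    table = defaultdict(set)
    for mol_id, atoms in database.items():
        for basis in permutations(range(len(atoms)), 3):
            for key in frame_keys(atoms, basis, bin_size):
                table[key].add((mol_id, basis))
    return table

def search(table, query, basis=(0, 1, 2), bin_size=0.5):
    """Vote with one query basis; return the best vote count per molecule."""
    votes = defaultdict(int)
    for key in frame_keys(query, basis, bin_size):
        for entry in table.get(key, ()):
            votes[entry] += 1
    best = {}
    for (mol_id, _), n in votes.items():
        best[mol_id] = max(best.get(mol_id, 0), n)
    return best
```

In this sketch a query with n atoms can collect at most n − 3 votes for a molecule, so accepting molecules whose vote count falls a few short of that maximum is one simple way to tolerate a small number of edits. A fuller version would presumably try several query bases, since a single basis fails if one of its three atoms has been deleted or displaced, and would handle coordinates that fall near bin boundaries.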
