The hybrid tree: an index structure for high dimensional feature spaces
- 1 January 1999
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 440-447
- https://doi.org/10.1109/icde.1999.754960
Abstract
Feature-based similarity searching is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high-dimensional feature space which is indexed using a multidimensional data structure. Similarity searching then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing, none of them is known to scale beyond 10-15 dimensional spaces. This paper introduces the hybrid tree-a multidimensional data structure for indexing high-dimensional feature spaces. Unlike other multidimensional data structures, the hybrid tree cannot be classified as either a pure data partitioning (DP) index structure (such as the R-tree, SS-tree or SR-tree) or a pure space partitioning (SP) one (such as the KDB-tree or hB-tree); rather it combines the positive aspects of the two types of index structures into a single data structure to achieve a search performance which is more scalable to high dimensionalities than either of the above techniques. Furthermore, unlike many data structures (e.g. distance-based index structures like the SS-tree and SR-tree), the hybrid tree can support queries based on arbitrary distance functions. Our experiments on "real" high-dimensional large-size feature databases demonstrate that the hybrid tree scales well to high dimensionality and large database sizes. It significantly outperforms both purely DP-based and SP-based index mechanisms as well as linear scans at all dimensionalities for large-sized databases.Keywords
This publication has 16 references indexed in Scilit:
- An implementation and performance analysis of spatial data access methodsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2003
- Tutorial 5: Indexing high-dimensional spaces: database support for next decade's applicationsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2000
- Supporting ranked Boolean similarity queries in MARSIEEE Transactions on Knowledge and Data Engineering, 1998
- Supporting similarity queries in MARSPublished by Association for Computing Machinery (ACM) ,1997
- The SR-treePublished by Association for Computing Machinery (ACM) ,1997
- Distance-based indexing for high-dimensional metric spacesPublished by Association for Computing Machinery (ACM) ,1997
- The TV-tree: An index structure for high-dimensional dataThe VLDB Journal, 1994
- The hB-tree: a multiattribute indexing method with good guaranteed performanceACM Transactions on Database Systems, 1990
- The Grid FileACM Transactions on Database Systems, 1984
- R-treesPublished by Association for Computing Machinery (ACM) ,1984