Similarity query processing using disk arrays
- 1 June 1998
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGMOD Record
- Vol. 27 (2) , 225-236
- https://doi.org/10.1145/276305.276325
Abstract
Similarity queries are fundamental operations that are used extensively in many modern applications, whereas disk arrays are powerful storage media of increasing importance. The basic trade-off in similarity query processing in such a system is that increased parallelism leads to higher resource consumptions and low throughput, whereas low parallelism leads to higher response times. Here, we propose a technique which is based on a careful investigation of the currently available data in order to exploit parallelism up to a point, retaining low response times during query processing. The underlying access method is a variation of the R*-tree, which is distributed among the components of a disk array, whereas the system is simulated using event-driven simulation. The performance results conducted, demonstrate that the proposed approach outperforms by factors a previous branch-and-bound algorithm and a greedy algorithm which maximizes parallelism as much as possible. Moreover, the comparison of the proposed algorithm to a hypothetical (non-existing) optimal one (with respect to the number of disk accesses) shows that the former is on average two times slower than the latter.Keywords
This publication has 18 references indexed in Scilit:
- A model for the prediction of R-tree performancePublished by Association for Computing Machinery (ACM) ,1996
- Nearest neighbor queriesPublished by Association for Computing Machinery (ACM) ,1995
- RAID: high-performance, reliable secondary storageACM Computing Surveys, 1994
- Beyond uniformity and independencePublished by Association for Computing Machinery (ACM) ,1994
- Towards an analysis of range query performance in spatial data structuresPublished by Association for Computing Machinery (ACM) ,1993
- Parallel R-treesPublished by Association for Computing Machinery (ACM) ,1992
- Probability distributions for seek time evaluationInformation Sciences, 1992
- The R*-tree: an efficient and robust access method for points and rectanglesPublished by Association for Computing Machinery (ACM) ,1990
- R-treesPublished by Association for Computing Machinery (ACM) ,1984
- An Algorithm for Finding Best Matches in Logarithmic Expected TimeACM Transactions on Mathematical Software, 1977