Sampling from spatial databases

30 December 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 5 (1), 199-208
https://doi.org/10.1109/icde.1993.344062

Abstract

This paper deals with techniques for obtaining random point samples from spatial databases. We seek random points from a continuous domain (usually ℝ²) that satisfy a spatial predicate represented in the database as a collection of polygons. Several applications of spatial sampling (e.g. environmental monitoring, agronomy, forestry) are described. Sampling problems are characterized in terms of two key parameters: selectivity and overlap. We discuss two fundamental approaches to sampling with spatial predicates, depending on whether we sample first or evaluate the predicate first. The approaches are described in the context of both quadtrees and R-trees, detailing the sample first, acceptance/rejection tree, and partial area tree algorithms. A sequential algorithm is also described. The relative performance of the various sampling algorithms is compared, and a choice of preferred algorithms is suggested. We conclude with a short discussion of possible extensions.
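
As a rough illustration of the "sample first" idea mentioned in the abstract, the sketch below draws uniform points from a bounding box and keeps only those that fall inside at least one polygon. It is a minimal sketch under stated assumptions, not the paper's algorithm: the names `point_in_polygon` and `sample_first`, the `bbox` and `max_tries` parameters, and the linear scan over polygons (in place of a quadtree or R-tree lookup) are all hypothetical choices made for this example.

```python
import random


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_first(polygons, bbox, k, rng=None, max_tries=1_000_000):
    """Return k points, uniform over the union of the polygons.

    Assumption for this sketch: bbox = (xmin, ymin, xmax, ymax) encloses
    every polygon, and the predicate is checked by a linear scan rather
    than through a spatial index.
    """
    rng = rng or random.Random()
    xmin, ymin, xmax, ymax = bbox
    samples = []
    tries = 0
    while len(samples) < k:
        if tries >= max_tries:
            raise RuntimeError("too many rejections; selectivity may be very low")
        tries += 1
        # Sample first: draw a candidate point from the whole domain.
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Then evaluate the predicate: accept if any polygon contains it.
        if any(point_in_polygon(x, y, p) for p in polygons):
            samples.append((x, y))
    return samples
```

Because a candidate is accepted when any polygon contains it, overlapping polygons do not bias the sample. The expected number of candidates per accepted point is the inverse of the fraction of the bounding box that the polygons cover, so low selectivity makes this approach expensive.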

This publication has 8 references indexed in Scilit:

Random sampling from databases: a survey. Statistics and Computing, 1995.
Processing aggregate relational queries with hard time constraints. Association for Computing Machinery (ACM), 1989.
Statistical estimators for relational algebra expressions. Association for Computing Machinery (ACM), 1988.
Sampling Theory for Forest Inventory. Springer Nature, 1986.
Random sampling with a reservoir. ACM Transactions on Mathematical Software, 1985.
The Quadtree and Related Hierarchical Data Structures. ACM Computing Surveys, 1984.
R-trees. Association for Computing Machinery (ACM), 1984.
An Efficient Method for Weighted Sampling without Replacement. SIAM Journal on Computing, 1980.