Random sampling with a reservoir
- 1 March 1985
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Mathematical Software
- Vol. 11 (1) , 37-57
- https://doi.org/10.1145/3147.3165
Abstract
We introduce fast algorithms for selecting a random sample of n records without replacement from a pool of N records, where the value of N is unknown beforehand. The main result of the paper is the design and analysis of Algorithm Z; it does the sampling in one pass using constant space and in O ( n (1 + log( N/n ))) expected time, which is optimum, up to a constant factor. Several optimizations are studied that collectively improve the speed of the naive version of the algorithm by an order of magnitude. We give an efficient Pascal-like implementation that incorporates these modifications and that is suitable for general use. Theoretical and empirical results indicate that Algorithm Z outperforms current methods by a significant margin.Keywords
This publication has 4 references indexed in Scilit:
- Faster methods for random samplingCommunications of the ACM, 1984
- An Algorithm for Unbiased Random SamplingThe Computer Journal, 1982
- A note on sampling a tape-fileCommunications of the ACM, 1962
- Development of Sampling Plans by Using Sequential (Item by Item) Selection Techniques and Digital ComputersJournal of the American Statistical Association, 1962