Optimum algorithms for two random sampling problems
- 1 November 1983
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 02725428,p. 65-75
- https://doi.org/10.1109/sfcs.1983.43
Abstract
Several fast new algorithms are presented for sampling n records at random from a file containing N records. The first problem we solve deals with sampling when N is known, and the the second problem considers the case when N is unknown. The two main results in this paper are Algorithms D and Z. Algorithm D solves the first problem by doing the sampling with a small constant amount of space and in O(n) time, on the average; roughly n uniform random variates are generated, and approximately n exponentiation operations are performed during the sampling The sample is selected sequentially and online; it answers an open problem in [Knuth 81]. Algorithm Z solves the second problem by doing the sampling using O(n) space, roughly n ln(N/n) uniform random variates and O(n(1 + log(N/n))) time, on the average. Both algorithms are time- and space-optimum and are short and easy to implement.Keywords
This publication has 3 references indexed in Scilit:
- Generating Sorted Lists of Random NumbersACM Transactions on Mathematical Software, 1980
- A note on sampling a tape-fileCommunications of the ACM, 1962
- Development of Sampling Plans by Using Sequential (Item by Item) Selection Techniques and Digital ComputersJournal of the American Statistical Association, 1962