ESTIMATING POPULATION SIZE WHEN DUPLICATES ARE PRESENT

15 August 1996

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 15 (15), 1635-1646
https://doi.org/10.1002/(sici)1097-0258(19960815)15:15<1635::aid-sim337>3.0.co;2-t

Abstract

Each of K mental health programmes reports the number of patients served in a year. The sum of these numbers, y, is an overcount because some patients are seen in more than one programme. Health care planners need to know the unduplicated number served by the mental health system. Thus, there is an unknown number, M, of distinct individuals who appear on one or more of K lists; some appear on multiple lists and the duplicates are not readily identifiable. Let X be the number of lists on which a randomly selected individual appears. When E(X) is known, y/E(X) is the natural estimator of M. We assume that we know the number of programmes, X_i, used by the ith individual in a random sample of recipients of service. Here, the intuitive estimator, Y/X̄, has desirable statistical properties. We give confidence interval estimators for M. We apply the method to estimate the number of individuals served in 1991 by the mental health programmes in New York State.
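
The estimator can be illustrated with a short sketch. Because each individual contributes X to the reported total, y is the sum of X over all M individuals, so y is approximately M times E(X); dividing y by the sample mean of the X_i gives the estimate. The code below is a minimal illustration under stated assumptions, not the authors' procedure: it treats y as fixed, assumes the X_i come from a simple random sample of individuals, and uses a generic delta-method normal approximation for the interval, which may differ from the confidence interval estimators developed in the article. The function name, the `z` parameter, and the example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def estimate_population(y, x_sample, z=1.96):
    """Estimate the unduplicated count M as y / mean(x_sample).

    y        -- duplicated total: sum of the counts reported by all programmes
    x_sample -- number of programmes used by each sampled individual (each >= 1)
    z        -- normal quantile for the interval (1.96 gives roughly 95%)

    The interval is a delta-method approximation that treats y as fixed.
    This is an assumption of the sketch, not the article's interval.
    """
    n = len(x_sample)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    if any(x < 1 for x in x_sample):
        raise ValueError("every sampled individual uses at least one programme")

    x_bar = mean(x_sample)
    m_hat = y / x_bar

    # Delta method: SE(y / x_bar) is approximately y * s / (sqrt(n) * x_bar**2)
    se = y * stdev(x_sample) / (math.sqrt(n) * x_bar ** 2)
    return m_hat, (m_hat - z * se, m_hat + z * se)


if __name__ == "__main__":
    # Hypothetical data: 1500 reported service counts, 10 sampled individuals
    m_hat, (lower, upper) = estimate_population(1500, [1, 2, 1, 2, 1, 1, 3, 1, 2, 1])
    print(f"estimated M = {m_hat:.0f}, approximate 95% CI = ({lower:.0f}, {upper:.0f})")
```

With a sample this small the normal approximation is rough; the sketch is meant only to show how the duplicated total and the sampled multiplicities combine.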
