Treating grouped data as continuous in alcohol consumption measures

Abstract
In quantity‐frequency methods used for self‐report measurement of alcohol intake (or other exposures), respondents mark the appropriate ranges, e.g. ‘5 to 8 drinks’, ‘5 or 6 times per week’. To calculate average consumption, only single values, not ranges, can be multiplied, and midpoints are commonly used. This results in bias if the range lies in the tail of a distribution, as often happens with drinks per occasion. The same bias occurs when risk, for example, is plotted against consumption levels, which inevitably are grouped into ranges. Consequently, estimates of aggregate consumption can be exaggerated and curves of risk against exposure level can be misleading. A method is described to calculate a relatively unbiased representative value for a range, requiring only knowledge of the normal distribution table, the log‐normal distribution, and basic arithmetic. Part of the procedure is also useful for estimating percentile points in data that have been grouped differently, such as income in dollar groups.
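The abstract does not spell out the procedure, so the following is only a minimal sketch of one way such a calculation could go, not necessarily the author's method. It assumes the underlying quantity is log‐normal, fits the two parameters from two cumulative proportions (the step a normal distribution table would serve), and then takes the conditional mean within a range as its representative value. The function names and the survey proportions are hypothetical and chosen purely for illustration.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; stands in for the printed table


def fit_lognormal(x1, p1, x2, p2):
    """Return (mu, sigma) of ln X from two cumulative proportions.

    p1 is the share of responses at or below x1, p2 the share at or below x2.
    Assumption: X is log-normal, so ln X is normal with mean mu and sd sigma.
    """
    z1, z2 = STD_NORMAL.inv_cdf(p1), STD_NORMAL.inv_cdf(p2)
    sigma = (log(x2) - log(x1)) / (z2 - z1)
    mu = log(x1) - sigma * z1
    return mu, sigma


def range_mean(a, b, mu, sigma):
    """Mean of a log-normal variable given that it falls between a and b."""
    cdf = STD_NORMAL.cdf
    za, zb = (log(a) - mu) / sigma, (log(b) - mu) / sigma
    mass = cdf(zb) - cdf(za)  # probability of landing in the range
    partial = exp(mu + sigma**2 / 2) * (cdf(zb - sigma) - cdf(za - sigma))
    return partial / mass


def percentile(p, mu, sigma):
    """Value below which a proportion p of the fitted distribution lies."""
    return exp(mu + sigma * STD_NORMAL.inv_cdf(p))


# Hypothetical data: 50% of occasions involve at most 2 drinks, 90% at most 5.
mu, sigma = fit_lognormal(2, 0.50, 5, 0.90)
rep = range_mean(5, 8, mu, sigma)  # representative value for '5 to 8 drinks'
midpoint = (5 + 8) / 2
print(f"representative value {rep:.2f} vs midpoint {midpoint:.2f}")
print(f"estimated 75th percentile: {percentile(0.75, mu, sigma):.2f}")
```

Because the fitted density is falling across the ‘5 to 8 drinks’ range in this example, the conditional mean lies below the midpoint, which is the direction of bias the abstract describes for ranges in the upper tail.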
