Sampling from a skewed population distribution as exemplified by estimation of the creatine kinase upper reference limit.

Abstract
Creatine kinase (EC 2.7.3.2) was measured in sera from 580 females, aged 1-77 years, and 550 males, aged 1-63 years. The distributions of results for both the male and female groups were markedly skewed toward higher values. None of six mathematical formulas for skewed distributions described the observed distribution, which indicates that such formulas are unsuitable for transforming these data for parametric analysis. Estimates of the 97.5th percentile were obtained from six independent samples of 100, 200, and 400 observations randomly selected from a mathematical model defined by the adult female distribution. The range of these estimates narrowed progressively, from 150-380 U/L for the samples of 100 observations to 200-265 U/L for the samples of 400 observations; no further improvement was seen when 800 observations were used. The samples of 100 and 200 observations contained extreme values that might appear to be "outliers" but were shown to be valid members of the population distribution when larger samples were collected.
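The resampling experiment described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the mathematical model, so a lognormal distribution with arbitrary parameters (`mu`, `sigma`) is assumed here purely as a stand-in for a right-skewed population. Reading the design as six replicate samples per sample size is also an assumption, and the function names are hypothetical. The 97.5th percentile is estimated nonparametrically from order statistics.

```python
import random


def percentile(values, p):
    """Nonparametric percentile: linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("empty sample")
    rank = (p / 100.0) * (len(xs) - 1)  # fractional index into the sorted data
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    frac = rank - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


def draw_sample(rng, n, mu=4.2, sigma=0.55):
    # ASSUMPTION: a lognormal population stands in for the paper's model,
    # which the abstract does not describe. mu and sigma are illustrative only.
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def spread_of_upper_limits(n, replicates=6, seed=0):
    """Return (min, max) of the 97.5th percentile estimates across replicates.

    ASSUMPTION: 'replicates' independent samples are drawn per sample size.
    """
    rng = random.Random(seed)
    estimates = [percentile(draw_sample(rng, n), 97.5) for _ in range(replicates)]
    return min(estimates), max(estimates)


if __name__ == "__main__":
    for n in (100, 200, 400, 800):
        low, high = spread_of_upper_limits(n)
        print(f"n={n:4d}  97.5th percentile estimates range: {low:.0f} to {high:.0f}")
```

With only six replicates per sample size, the range usually, but not always, narrows as `n` grows; the printed values depend on the seed and on the assumed distribution and should not be compared with the U/L figures reported in the abstract.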