Abstract
Reliability, as measured by the extent of agreement, is often a problem for complex global judgments. Empirically, the use of multiple raters improved reliability, consistent with predictions from the Spearman-Brown formula. Implications for the reliability of clinical diagnosis are suggested.
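
For reference, the Spearman-Brown formula in its standard form is sketched below. The symbols are assumptions introduced here and do not come from the abstract: \rho_1 denotes the reliability of a single rater's judgment, k the number of raters whose judgments are pooled, and \rho_k the predicted reliability of the pooled judgment. The numerical values in the second line are purely illustrative and are not results of the study.

```latex
% Spearman-Brown prediction for the reliability of k pooled raters
% (\rho_1, k, and \rho_k are notation assumed for this sketch)
\[
  \rho_k = \frac{k\,\rho_1}{1 + (k - 1)\,\rho_1}
\]
% Illustrative values only: \rho_1 = 0.50 and k = 3
\[
  \rho_3 = \frac{3 \times 0.50}{1 + 2 \times 0.50} = \frac{1.50}{2.00} = 0.75
\]
```

Under this formula the predicted reliability increases with k but with diminishing returns, which is the pattern against which an empirical gain from adding raters can be compared.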