On the Dangers of Averaging Across Subjects When Using Multidimensional Scaling or the Similarity-Choice Model

Abstract
When ratings of judged similarity or frequencies of stimulus identification are averaged across subjects, the psychological structure of the data is fundamentally changed. Regardless of the structure of the individual-subject data, the averaged similarity data will likely be well fit by a standard multidimensional scaling model, and the averaged identification data will likely be well fit by the similarity-choice model. In fact, both models often provide excellent fits to averaged data, even if they fail to fit the data of each individual subject. Thus, a good fit of either model to averaged data cannot be taken as evidence that the model describes the psychological structure that characterizes individual subjects. We hypothesize that these effects are due to the increased symmetry that is a mathematical consequence of the averaging operation.
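The symmetry effect hypothesized above can be illustrated with a small simulation. The sketch below is a toy example, not the analysis reported in this article: the subjects are hypothetical, their ratings are assumed to be independent uniform draws on a 1-to-9 scale, and the names `asymmetry`, `simulate_subject`, and `average_matrices` are introduced only for illustration. It compares the mean asymmetry of the individual similarity matrices with the asymmetry of their average.

```python
import random


def asymmetry(m):
    """Mean absolute difference |m[i][j] - m[j][i]| over all pairs i < j."""
    n = len(m)
    diffs = [abs(m[i][j] - m[j][i]) for i in range(n) for j in range(i + 1, n)]
    return sum(diffs) / len(diffs)


def simulate_subject(n_stimuli, rng):
    """Hypothetical subject (an assumption for this sketch): every cell is an
    independent rating on a 1-9 scale, so m[i][j] and m[j][i] generally differ."""
    return [[rng.uniform(1, 9) for _ in range(n_stimuli)] for _ in range(n_stimuli)]


def average_matrices(mats):
    """Cell-by-cell mean of a list of equally sized square matrices."""
    n = len(mats[0])
    k = len(mats)
    return [[sum(m[i][j] for m in mats) / k for j in range(n)] for i in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
subjects = [simulate_subject(6, rng) for _ in range(40)]

mean_individual_asymmetry = sum(asymmetry(m) for m in subjects) / len(subjects)
averaged_asymmetry = asymmetry(average_matrices(subjects))

print("mean asymmetry of individual matrices:", round(mean_individual_asymmetry, 3))
print("asymmetry of the averaged matrix:     ", round(averaged_asymmetry, 3))
```

Because the absolute value of a mean never exceeds the mean of the absolute values, the averaged matrix can be no more asymmetric than the typical individual matrix under this index, whatever the individual data look like. When the individual asymmetries do not share a common direction, as in this sketch, they largely cancel in the average.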