Dimensional Reduction of Word-Frequency Data as a Substitute for Intersubjective Content Analysis

Abstract
This paper presents a method for using dimensional reduction in the analysis of political content. We draw inspiration from latent semantic analysis (LSA) theory, which posits that factor analysis can successfully model human language. We suggest that factor analysis of word frequencies generated from any political text (for example, open-ended survey responses) provides adequate content analysis categories and can substitute for more commonly practiced techniques. The method proceeds in three steps: data preparation, exploratory factor analyses, and hypothesis testing. It may also offer a further benefit: it allows the data to speak more clearly in the development of coding dictionaries while avoiding the inferential circularity common in other data-driven approaches. We demonstrate the method using responses collected in an experiment on the topic of partial-birth abortion, and we assess the demonstration by presenting a human coding of the same material.