Linguistic Constraints on Statistical Computations

Abstract
Speech is produced mainly in continuous streams containing several words. Listeners can use the transitional probabilities (TPs) between adjacent and non-adjacent syllables to segment “words” from a continuous stream of artificial speech, much as they use TPs to organize a variety of perceptual continua. It is thus possible that a general-purpose statistical device exploits any speech unit to segment speech streams. Alternatively, language may limit which representations are open to statistical investigation according to their specific linguistic role. In this article, we focus on vowels and consonants in continuous speech. We hypothesized that vowels and consonants in words carry different kinds of information, with consonants more closely tied to word identification and vowels to grammar. We thus predicted that in a word identification task involving continuous speech, learners would track TPs among consonants, but not among vowels. Our results show a preferential role for consonants in word identification.