New York, Dayton (Ohio), and the Raw Frequency Fallacy

Abstract
There is a long-standing tradition in Chomskyan generative grammar of rejecting the relevance of corpus studies. A variety of arguments are put forth to justify this rejection, most importantly, that corpora are necessarily “finite and somewhat accidental” while the set of grammatical utterances is “presumably infinite” (Chomsky 1957: 15), and that, therefore, “probabilistic considerations have nothing to do with grammar” (Chomsky 1964 [1962]: 215, n. 1; cf. also Chomsky 1957: 17). Chomsky is frequently reported as backing up this claim with the observation that the sentence “I live in New York” is fundamentally more likely than “I live in Dayton, Ohio” purely by virtue of the fact that there are more people likely to say the former than the latter (McEnery and Wilson 2001: 10). As always, it is difficult to decide whether Chomsky seriously offers this example in support of his position. Not that it really matters: Chomsky’s contempt for, and his ignorance of, quantitative issues is of no concern to modern corpus linguistics. Chomsky’s irredeemably anti-empirical views are firmly rooted in his anti-empiricist philosophy, and no amount of quantitatively sophisticated corpus-based argumentation will ever change his mind.
