Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words
Open Access
- 11 November 2009
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 4 (11) , e7678
- https://doi.org/10.1371/journal.pone.0007678
Abstract
Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well. By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type – a measure of the logicality of each word – and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage. Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics.
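As a rough illustration of the quantities the abstract refers to, the following Python sketch measures the distances between successive occurrences of a word in a token list and evaluates a stretched exponential (Weibull) survival function. It is not the authors' code or their generative model. The function names and the parameters `scale` and `beta` are assumptions introduced here; the only fact relied on is the standard Weibull form P(distance > tau) = exp(-(tau/scale)^beta), where beta = 1 gives the exponential decay expected of a Poisson process and beta < 1 gives a heavier tail.

```python
import math


def recurrence_distances(tokens, word):
    """Distances, in tokens, between successive occurrences of `word`."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def weibull_survival(tau, scale, beta):
    """P(distance > tau) under a stretched exponential (Weibull) law.

    `scale` and `beta` are illustrative parameter names (assumptions);
    beta = 1 reduces to a plain exponential.
    """
    return math.exp(-((tau / scale) ** beta))


def coefficient_of_variation(distances):
    """Standard deviation divided by mean of the observed distances.

    Zero for perfectly regular spacing; larger values indicate
    more uneven (burstier) spacing.
    """
    mean = sum(distances) / len(distances)
    var = sum((d - mean) ** 2 for d in distances) / len(distances)
    return math.sqrt(var) / mean


# Toy usage on a made-up token sequence (not data from the paper).
tokens = "a b a c c a".split()
gaps = recurrence_distances(tokens, "a")
long_gap_exponential = weibull_survival(40, 10, 1.0)
long_gap_stretched = weibull_survival(40, 10, 0.5)
```

With the same `scale`, the stretched case (beta = 0.5) assigns more probability to long gaps than the exponential case, which is the qualitative signature of bursts separated by lulls.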