Hierarchical structures induce long-range dynamical correlations in written texts
- 23 May 2006
- journal article
- Published in Proceedings of the National Academy of Sciences
- Vol. 103 (21), 7956-7961
- https://doi.org/10.1073/pnas.0510673103
Abstract
Thoughts and ideas are multidimensional and often concurrent, yet they can be expressed surprisingly well sequentially by the translation into language. This reduction of dimensions occurs naturally but requires memory and necessitates the existence of correlations, e.g., in written text. However, correlations in word appearance decay quickly, while previous observations of long-range correlations using random walk approaches yield little insight on memory or on semantic context. Instead, we study combinations of words that a reader is exposed to within a “window of attention,” spanning about 100 words. We define a vector space of such word combinations by looking at words that co-occur within the window of attention, and analyze its structure. Singular value decomposition of the co-occurrence matrix identifies a basis whose vectors correspond to specific topics, or “concepts” that are relevant to the text. As the reader follows a text, the “vector of attention” traces out a trajectory of directions in this “concept space.” We find that memory of the direction is retained over long times, forming power-law correlations. The appearance of power laws hints at the existence of an underlying hierarchical network. Indeed, imposing a hierarchy similar to that defined by volumes, chapters, paragraphs, etc. succeeds in creating correlations in a surrogate random text that are identical to those of the original text. We conclude that hierarchical structures in text serve to create long-range correlations, and use the reader’s memory in reenacting some of the multidimensionality of the thoughts being expressed.
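The pipeline outlined in the abstract (window co-occurrence, singular value decomposition, a trajectory in concept space, and correlations of its direction) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `attention_vectors` and `direction_autocorrelation`, the way co-occurrence is accumulated from window counts, the number of retained concepts, and the use of mean cosine similarity as the correlation measure are all assumptions made for illustration.

```python
import numpy as np

def attention_vectors(words, window=100, n_concepts=3):
    """Project sliding windows of `words` onto SVD 'concept' directions.

    Assumption: co-occurrence is built from raw word counts per window;
    the paper may normalize or weight these differently.
    """
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    ids = np.array([index[w] for w in words])
    n_windows = len(words) - window + 1
    # Row t holds the word counts of the window starting at position t.
    counts = np.zeros((n_windows, len(vocab)))
    for t in range(n_windows):
        counts[t] = np.bincount(ids[t:t + window], minlength=len(vocab))
    # Word-by-word co-occurrence accumulated over all windows.
    cooc = counts.T @ counts
    # Leading singular vectors of the co-occurrence matrix act as concept axes.
    u, _, _ = np.linalg.svd(cooc)
    basis = u[:, :n_concepts]
    # Each window becomes a unit vector: only its direction is kept.
    traj = counts @ basis
    norms = np.linalg.norm(traj, axis=1, keepdims=True)
    return traj / np.where(norms == 0, 1.0, norms)

def direction_autocorrelation(traj, max_lag):
    """Mean cosine similarity between attention directions `lag` steps apart."""
    return np.array([
        np.mean(np.sum(traj[:-lag] * traj[lag:], axis=1))
        for lag in range(1, max_lag + 1)
    ])
```

Given a tokenized text as a list of strings, `direction_autocorrelation(attention_vectors(tokens), max_lag)` returns one value per lag; plotting it on log-log axes is one way to look for the slow, power-law-like decay the abstract reports. Building one count row per window is memory-hungry for long texts, so a real analysis would likely use sparse matrices or subsampled windows.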