The dimensionality of discourse
- 16 March 2010
- journal article
- Published in Proceedings of the National Academy of Sciences
- Vol. 107 (11), 4866-4871
- https://doi.org/10.1073/pnas.0908315107
Abstract
The paragraph spaces of five text corpora, of different genres and intended audiences, in four different languages, all show the same two-scale structure, with the dimension at short distances being lower than at long distances. In all five cases the short-distance dimension is approximately eight. Control simulations with randomly permuted word instances do not exhibit a low-dimensional structure. The observed topology places important constraints on the way in which authors construct prose, which may be universal.