Guessing probability distributions from small samples
Preprint
- 22 March 2002
Abstract
We propose a new method for calculating statistical properties, such as the entropy, of unknown generators of symbolic sequences. The probability distribution p(k) of the elements k of a population can be approximated by the frequencies f(k) of a sample, provided the sample is long enough that each element k occurs many times. Our method yields an approximation even when this precondition does not hold. For a given f(k), we recalculate the Zipf-ordered probability distribution by optimizing the parameters of a guessed distribution. We demonstrate that our method yields reliable results.
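The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-parameter power-law family p(k) ∝ k^(-alpha), the squared-error loss, the Monte Carlo averaging, and the grid search are all assumptions chosen for illustration, and every function name is hypothetical. The sketch guesses a distribution, simulates which Zipf-ordered (rank-ordered) frequencies a sample of the same size would show on average, and keeps the parameter whose simulated frequencies best match the observed ones.

```python
import math
import random
from collections import Counter


def zipf_probs(alpha, m):
    """Assumed guessed family: p(k) proportional to k**(-alpha), k = 1..m."""
    weights = [k ** -alpha for k in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def rank_frequencies(sample, m):
    """Zipf-ordered relative frequencies f(k), zero-padded to length m."""
    counts = sorted(Counter(sample).values(), reverse=True)
    counts += [0] * (m - len(counts))
    n = len(sample)
    return [c / n for c in counts]


def expected_rank_frequencies(probs, n, rng, trials=200):
    """Monte Carlo mean of the rank-ordered frequencies for samples of size n."""
    m = len(probs)
    acc = [0.0] * m
    for _ in range(trials):
        sample = rng.choices(range(m), weights=probs, k=n)
        for i, f in enumerate(rank_frequencies(sample, m)):
            acc[i] += f
    return [a / trials for a in acc]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def fit_alpha(observed, n, m, alphas, seed=0):
    """Grid search: pick the alpha whose simulated frequencies match best."""
    best_alpha, best_loss = None, float("inf")
    for alpha in alphas:
        rng = random.Random(seed)  # same random stream for every candidate
        expected = expected_rank_frequencies(zipf_probs(alpha, m), n, rng)
        loss = sum((o - e) ** 2 for o, e in zip(observed, expected))
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha


def demo():
    m, n, true_alpha = 50, 40, 1.0  # small sample: n is smaller than m
    rng = random.Random(42)
    true_probs = zipf_probs(true_alpha, m)
    sample = rng.choices(range(m), weights=true_probs, k=n)
    observed = rank_frequencies(sample, m)
    alpha_hat = fit_alpha(observed, n, m, [i * 0.25 for i in range(9)])
    print("entropy of the true distribution:", entropy(true_probs))
    print("plug-in entropy from f(k):       ", entropy(observed))
    print("entropy of the fitted guess:     ", entropy(zipf_probs(alpha_hat, m)))


if __name__ == "__main__":
    demo()
```

Running `demo()` prints the three entropies side by side. The plug-in value computed directly from f(k) can never exceed log(n), which is one way to see why small samples call for a different estimator. The sketch also assumes the number of symbols m is known in advance.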
All Related Versions
- Version 1, 2002-03-22, ArXiv
- Published version: Journal of Statistical Physics, 80 (5-6), 1443.