Abstract
We show that earlier approaches in probabilistic information retrieval are based on only one or two of three concepts: abstraction, inductive learning, and probabilistic assumptions. We propose a new approach that combines all three. The approach is illustrated for the case of indexing with a controlled vocabulary. For this purpose, we first describe a new probabilistic model, which is then combined with logistic regression, yielding a generalization of the original model. Experimental results are given for the pure theoretical model as well as for heuristic variants. Furthermore, linear and logistic regression are compared.
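To make the comparison of linear and logistic regression concrete, the following is a minimal sketch and not the model or data of the paper. It assumes a single hypothetical feature (an association score between a document and a descriptor) and invented binary relevance labels, and it fits both regressions in plain Python. All names (`fit_linear`, `fit_logistic`, `scores`, `labels`) are illustrative assumptions.

```python
import math

# Hypothetical toy data (NOT from the paper): one association score per
# document-descriptor pair, and whether the descriptor assignment was correct.
scores = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Logistic regression fitted by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(a + b * x)
            grad_a += residual
            grad_b += residual * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


lin_a, lin_b = fit_linear(scores, labels)
log_a, log_b = fit_logistic(scores, labels)


def linear_estimate(x):
    return lin_a + lin_b * x


def logistic_estimate(x):
    return sigmoid(log_a + log_b * x)


for x in (0.0, 0.5, 1.0):
    print(f"score={x:.1f} linear={linear_estimate(x):.3f} "
          f"logistic={logistic_estimate(x):.3f}")
```

On this toy data the linear fit produces estimates below 0 and above 1 at the ends of the score range, whereas the logistic estimates always stay strictly between 0 and 1, which is one practical reason to prefer a logistic function when the target is a probability.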
