Combining document representations for known-item search
- 28 July 2003
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 143-150
- https://doi.org/10.1145/860435.860463
Abstract
This paper investigates the pre-conditions for successful combination of document representations formed from structural markup for the task of known-item search. As this task is very similar to work in meta-search and data fusion, we adapt several hypotheses from those research areas and investigate them in this context. To investigate these hypotheses, we present a mixture-based language model and also examine many of the current meta-search algorithms. We find that compatible output from systems is important for successful combination of document representations. We also demonstrate that combining low performing document representations can improve performance, but not consistently. We find that the techniques best suited for this task are robust to the inclusion of poorly performing document representations. We also explore the role of variance of results across systems and its impact on the performance of fusion, with the surprising result that the correct documents have higher variance across document representations than highly ranking incorrect documents.
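The abstract does not give the form of the mixture-based language model, so the following is only an illustrative sketch of one common way such a model can be set up: each document representation (for example title, body, or anchor text) gets its own unigram model, each is smoothed against the collection by linear interpolation, and the per-term probabilities are mixed with fixed weights. The function name `mixture_score`, the representation names, the weights, and the `smoothing` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def mixture_score(query, doc_reps, collection, weights, smoothing=0.5):
    """Return log P(query | document) under a linear mixture of
    per-representation unigram models (hypothetical formulation).

    query      -- list of query terms
    doc_reps   -- dict: representation name -> list of tokens for one document
    collection -- list of all tokens in the collection (background model)
    weights    -- dict: representation name -> mixture weight (should sum to 1)
    smoothing  -- interpolation weight given to the collection model
    """
    coll_counts = Counter(collection)
    coll_len = len(collection)
    log_p = 0.0
    for term in query:
        p_coll = coll_counts[term] / coll_len
        p_term = 0.0
        for name, weight in weights.items():
            tokens = doc_reps.get(name, [])
            # Maximum-likelihood estimate within this representation.
            p_ml = tokens.count(term) / len(tokens) if tokens else 0.0
            # Smooth with the collection model, then add to the mixture.
            p_term += weight * ((1 - smoothing) * p_ml + smoothing * p_coll)
        if p_term == 0.0:
            # Term unseen in every representation and in the collection.
            return float("-inf")
        log_p += math.log(p_term)
    return log_p
```

With weights that favor the title representation, a document whose title matches the query scores higher than one that only mentions the query terms in its body, which is the kind of behavior a known-item search would want from combined representations.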
This publication has 12 references indexed in Scilit:
- Condorcet fusion for improved retrieval. Published by Association for Computing Machinery (ACM), 2002
- Two-stage language models for information retrieval. Published by Association for Computing Machinery (ACM), 2002
- The Importance of Prior Probabilities for Entry Page Search. Published by Association for Computing Machinery (ACM), 2002
- Analyses of multiple-evidence combinations for retrieval strategies. Published by Association for Computing Machinery (ACM), 2001
- A study of smoothing methods for language models applied to Ad Hoc information retrieval. Published by Association for Computing Machinery (ACM), 2001
- Modeling score distributions for combining the outputs of search engines. Published by Association for Computing Machinery (ACM), 2001
- Effective site finding using link anchor information. Published by Association for Computing Machinery (ACM), 2001
- Models for metasearch. Published by Association for Computing Machinery (ACM), 2001
- A language modeling approach to information retrieval. Published by Association for Computing Machinery (ACM), 1998
- A flexible model for retrieval of SGML documents. Published by Association for Computing Machinery (ACM), 1998