Probabilistic question answering on the web
- 7 May 2002
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 56 (6), 408-419
- https://doi.org/10.1145/511446.511500
Abstract
Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this paper we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR) using proximity and question type features, achieves a total reciprocal document rank of .20 on the TREC 8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
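The abstract reports a total reciprocal document rank without defining it. As a rough illustration only, the sketch below assumes the metric is the sum of 1/rank over every ranked item that contains a correct answer, with ranks starting at 1; the paper itself should be consulted for the exact definition. The function name `total_reciprocal_rank` and the sample flags are hypothetical and do not come from the paper or from NSIR.

```python
from typing import Sequence


def total_reciprocal_rank(is_correct: Sequence[bool]) -> float:
    """Sum 1/rank over every ranked item flagged as correct.

    Assumption: ranks start at 1 and every correct item contributes,
    not only the first one (as plain reciprocal rank would).
    """
    return sum(
        (1.0 / rank for rank, ok in enumerate(is_correct, start=1) if ok),
        0.0,
    )


# Hypothetical ranked list in which the items at ranks 2 and 5 contain the answer.
flags = [False, True, False, False, True]
score = total_reciprocal_rank(flags)  # 1/2 + 1/5
print(f"{score:.2f}")
```

Under this assumed definition, a system is rewarded both for placing a correct item near the top and for returning several correct items, which is why the score for a single question can exceed 1.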