An effective approach to document retrieval via utilizing WordNet and recognizing phrases
- 25 July 2004
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 266-272
- https://doi.org/10.1145/1008992.1009039
Abstract
Noun phrases in queries are identified and classified into four types: proper names, dictionary phrases, simple phrases and complex phrases. A document has a phrase if all content words in the phrase are within a window of a certain size. The window sizes for different types of phrases are different and are determined using a decision tree. Phrases are more important than individual terms. Consequently, documents in response to a query are ranked with matching phrases given a higher priority. We utilize WordNet to disambiguate word senses of query terms. Whenever the sense of a query term is determined, its synonyms, hyponyms, words from its definition and its compound words are considered for possible additions to the query. Experimental results show that our approach yields between 23% and 31% improvements over the best-known results on the TREC 9, 10 and 12 collections for short (title only) queries, without using Web data.
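The window-based phrase test described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `has_phrase`, the whitespace tokenization, and the window sizes in the usage lines are assumptions made for illustration. The abstract says only that window sizes differ by phrase type and are chosen by a decision tree, and gives no values.

```python
from collections import Counter


def has_phrase(doc_tokens, phrase_words, window):
    """Return True if every word in phrase_words occurs inside some span of
    at most `window` consecutive tokens of doc_tokens.

    Hypothetical helper: the window size is passed in by the caller; the
    paper derives it per phrase type with a decision tree.
    """
    needed = set(phrase_words)
    if not needed:
        return False
    counts = Counter()   # occurrences of each needed word in the current span
    covered = 0          # number of distinct needed words in the current span
    left = 0             # index of the first token in the current span
    for right, tok in enumerate(doc_tokens):
        if tok in needed:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Drop tokens from the left until the span fits in the window.
        while right - left + 1 > window:
            old = doc_tokens[left]
            if old in needed:
                counts[old] -= 1
                if counts[old] == 0:
                    covered -= 1
            left += 1
        if covered == len(needed):
            return True
    return False


doc = "the new york stock exchange closed early".split()
adjacent = has_phrase(doc, ["stock", "exchange"], window=2)   # words are adjacent
too_narrow = has_phrase(doc, ["new", "exchange"], window=3)   # words span 4 tokens
wide_enough = has_phrase(doc, ["new", "exchange"], window=4)
```

Under these assumptions, a ranking function could call `has_phrase` once per query phrase and order documents by phrase matches before falling back to individual-term scores, which mirrors the priority the abstract gives to phrases.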