Lexical ambiguity and information retrieval

1 April 1992

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems

Vol. 10 (2), 115-141
https://doi.org/10.1145/146802.146810

Abstract

Lexical ambiguity is a pervasive problem in natural language processing. However, little quantitative information is available about the extent of the problem or about the impact that it has on information retrieval systems. We report on an analysis of lexical ambiguity in information retrieval test collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. The experiments show that there is considerable ambiguity even in a specialized database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors contribute to determining whether disambiguation will make an improvement in performance. For example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Other uses of word sense disambiguation in an information retrieval context are discussed.
