A practical stemming algorithm for online search assistance
- 1 April 1983
- journal article
- Published by Emerald Publishing in Online Review
- Vol. 7 (4) , 301-318
- https://doi.org/10.1108/eb024132
Abstract
Word truncation is a familiar technique employed by online searchers in order to increase recall in free text retrieval. The use of truncation, however, can be a mixed blessing since many words starting with the same root are not semantically or logically related. Consequently, online searchers often select words to be OR‐ed together from an alphabetic display of neighbouring terms in the inverted file in order to assure precision in the search. Automatic stemming algorithms typically function in a manner analogous to word truncation, with the added risk of the word roots being incorrectly identified by the algorithm. This paper describes a two‐phase stemming algorithm that consists of the identification of the word root and the automatic selection of ‘well‐formed’ morphological word variants from the actual inverted file entries that start with the same word root. The algorithm has been successfully used in an end‐user interface to NLM's Catline book catalog file.
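The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the root-identification rules or the criteria for a 'well-formed' variant, so the suffix list `SUFFIXES`, the `min_root` threshold, and the function names `find_root` and `select_variants` are all hypothetical stand-ins chosen for illustration.

```python
from bisect import bisect_left

# Hypothetical suffix list; the rules used in the paper are not stated in the abstract.
SUFFIXES = ("", "e", "ed", "er", "ers", "es", "s", "ing", "ings", "ation", "ations")


def find_root(word, min_root=3):
    """Phase 1 (assumed rule): strip the longest listed suffix, keeping at least min_root letters."""
    word = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if suffix and word.endswith(suffix) and len(word) - len(suffix) >= min_root:
            return word[: -len(suffix)]
    return word


def select_variants(root, index_terms):
    """Phase 2 (assumed rule): from the index terms that start with the root,
    keep only those whose remainder is a listed suffix."""
    terms = sorted(index_terms)
    variants = []
    # Binary search to the first term >= root, then scan the alphabetic neighbours.
    for term in terms[bisect_left(terms, root):]:
        if not term.startswith(root):
            break  # past the block of terms sharing this root
        if term[len(root):] in SUFFIXES:
            variants.append(term)
    return variants


# Toy inverted-file vocabulary (invented for this example).
index_terms = ["cat", "catalog", "category", "cats", "cattle"]
root = find_root("cats")
print(root, select_variants(root, index_terms))
```

Plain truncation on `cat` would match all five toy terms, including `catalog` and `cattle`. The second phase keeps only `cat` and `cats`, which mirrors the abstract's point that selecting variants from the actual index entries protects precision.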