Resolving ambiguity for cross-language retrieval

Abstract
One of the main hurdles to improved cross-language information retrieval (CLIR) effectiveness is resolving the ambiguity associated with translation. Availability of resources is also a problem. First, we present a technique based on co-occurrence statistics from unlinked corpora which can be used to reduce the ambiguity associated with phrasal and term translation. We then combine this method with other techniques for reducing ambiguity and achieve more than 90% of monolingual effectiveness. Finally, we compare the co-occurrence method with parallel corpus and machine translation techniques and show that good retrieval effectiveness can be achieved without complex resources.
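
The general idea of co-occurrence-based disambiguation can be sketched as follows. This is an illustration only: the abstract does not specify the statistic, window size, or selection rule used, so the raw pair counts, the `window` parameter, and the function names `cooccurrence_counts` and `disambiguate` are all assumptions made for this example. Each query term is given a list of candidate translations, and the combination of candidates that co-occurs most often in a target-language corpus is selected.

```python
from collections import Counter
from itertools import combinations, product


def cooccurrence_counts(documents, window=5):
    """Count unordered pairs of distinct words that appear fewer than
    `window` positions apart in any tokenized document.

    Raw counts are an assumption of this sketch, not the paper's measure.
    """
    pair_counts = Counter()
    for tokens in documents:
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + window]:
                if left != right:
                    pair_counts[frozenset((left, right))] += 1
    return pair_counts


def disambiguate(candidates, pair_counts):
    """Pick one translation per query term.

    `candidates` holds one list of candidate translations per source term.
    The combination with the highest summed pairwise co-occurrence wins.
    """
    def score(combo):
        return sum(pair_counts[frozenset(pair)] for pair in combinations(combo, 2))

    return max(product(*candidates), key=score)


# Toy target-language corpus and hypothetical dictionary output.
corpus = [
    "the bank raised the interest rate".split(),
    "she sat on the bench".split(),
    "bank interest is low".split(),
]
counts = cooccurrence_counts(corpus)
best = disambiguate([["bank", "bench"], ["interest", "concern"]], counts)
```

Because the counts come from a monolingual target-language collection, a sketch like this needs only a bilingual dictionary and an unlinked corpus, not aligned parallel text. Enumerating every combination is exponential in query length, so it is only practical for short queries or phrases.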
