A bootstrapping method for extracting bilingual text pairs

Open Access

1 January 2000

conference paper
Published by Association for Computational Linguistics (ACL)

p. 1066-1070
https://doi.org/10.3115/992730.992806

Abstract

This paper proposes a method for extracting bilingual text pairs from a comparable corpus. The basic idea is to apply bootstrapping to an existing corpus-based cross-language information retrieval (CLIR) approach. We conducted preliminary tests with English and Japanese bilingual corpora. For the task of extracting translation pairs, the bootstrapping method gave much better results than the same corpus-based CLIR method without bootstrapping, and the extracted translation pairs could serve as useful training data for improving the results of the corpus-based CLIR method.
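
The abstract does not spell out the retrieval model or the update step, so the following is only a minimal sketch of how a bootstrapping loop of this general kind might look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the cosine scoring over translated term counts, the acceptance threshold, the deliberately crude rule for learning new term translations, and the toy romanized data.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def translate(tokens, lexicon):
    """Map source tokens into the target language, dropping unknown terms."""
    return Counter(lexicon[t] for t in tokens if t in lexicon)


def learn_translations(src_tokens, tgt_tokens, lexicon):
    """Hypothetical update rule: if an accepted pair leaves exactly one
    untranslated source term and one unexplained target term, align them."""
    unknown_src = {t for t in src_tokens if t not in lexicon}
    known_tgt = set(lexicon.values())
    unexplained_tgt = {t for t in tgt_tokens if t not in known_tgt}
    if len(unknown_src) == 1 and len(unexplained_tgt) == 1:
        lexicon[unknown_src.pop()] = unexplained_tgt.pop()


def bootstrap_pairs(src_docs, tgt_docs, seed_lexicon, threshold=0.75, max_rounds=5):
    """Alternate between retrieving confident pairs and enlarging the lexicon."""
    lexicon = dict(seed_lexicon)  # copy so the caller's seed stays unchanged
    pairs = {}                    # source doc id -> target doc id
    for _ in range(max_rounds):
        used_targets = set(pairs.values())
        candidates = []
        for s_id, s_tokens in src_docs.items():
            if s_id in pairs:
                continue
            query = translate(s_tokens, lexicon)
            for t_id, t_tokens in tgt_docs.items():
                if t_id in used_targets:
                    continue
                score = cosine(query, Counter(t_tokens))
                if score >= threshold:
                    candidates.append((score, s_id, t_id))
        # Accept the most confident candidates first, one target per source.
        accepted = []
        for score, s_id, t_id in sorted(candidates, reverse=True):
            if s_id not in pairs and t_id not in used_targets:
                pairs[s_id] = t_id
                used_targets.add(t_id)
                accepted.append((s_id, t_id))
        if not accepted:
            break  # nothing new was extracted, so another round cannot help
        # Feed the newly extracted pairs back in as training data.
        for s_id, t_id in accepted:
            learn_translations(src_docs[s_id], tgt_docs[t_id], lexicon)
    return pairs


# Toy data (invented for this sketch; romanized Japanese tokens).
seed = {"economy": "keizai", "bank": "ginkou", "rain": "ame"}
english = {"e1": ["economy", "bank", "market"], "e2": ["market", "rain"]}
japanese = {"j1": ["keizai", "ginkou", "shijou"], "j2": ["shijou", "ame"]}

single_pass = bootstrap_pairs(english, japanese, seed, max_rounds=1)
bootstrapped = bootstrap_pairs(english, japanese, seed)
print(single_pass)
print(bootstrapped)
```

In this toy setting a single retrieval pass pairs only `e1` with `j1`, because `market` is missing from the seed lexicon. Once that pair is accepted, the sketch learns `market` → `shijou` and the second round can also pair `e2` with `j2`. This only illustrates the feedback idea on invented data and says nothing about the results reported in the paper.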

Keywords

COMPARABLE CORPUS
BASIC IDEA
BETTER RESULT
JAPANESE BILINGUAL CORPUS
BOOTSTRAPPING METHOD
EXISTING CORPUS-BASED CROSS-LANGUAGE INFORMATION
PRELIMINARY TEST
BILINGUAL TEXT PAIR
TRANSLATION PAIR
CORPUS-BASED CLIR METHOD
