Representing text chunks

Open Access

1 January 1999

proceedings article
Published by Association for Computational Linguistics (ACL)

pp. 173-179
https://doi.org/10.3115/977035.977059

Abstract

Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we will examine seven different data representations for the problem of recognizing noun phrase chunks. We will show that the choice of data representation has a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve on the best published chunking results for a standard data set.
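To make "converting chunking to a tagging task" concrete, here is a minimal sketch of two commonly used inside/outside/begin encodings. The abstract does not list its seven representations, so the scheme names `IOB1` and `IOB2`, the function `spans_to_tags`, and the example sentence are assumptions chosen for illustration, not the paper's own code or data. In the IOB1 variant, `B` marks only a chunk that directly follows another chunk; in IOB2, `B` marks the first word of every chunk.

```python
from typing import List, Tuple


def spans_to_tags(n_tokens: int,
                  spans: List[Tuple[int, int]],
                  scheme: str = "IOB2") -> List[str]:
    """Encode non-overlapping chunk spans as one tag per token.

    spans are half-open (start, end) token index ranges.
    scheme is a hypothetical label for this sketch: "IOB1" or "IOB2".
    """
    if scheme not in ("IOB1", "IOB2"):
        raise ValueError(f"unknown scheme: {scheme}")

    tags = ["O"] * n_tokens          # every token starts outside any chunk
    prev_end = None                  # end index of the previous chunk
    for start, end in sorted(spans):
        for i in range(start, end):
            tags[i] = "I"            # tokens inside a chunk
        if scheme == "IOB2":
            tags[start] = "B"        # always mark the chunk-initial token
        elif prev_end == start:
            tags[start] = "B"        # IOB1: mark only when chunks are adjacent
        prev_end = end
    return tags


# Illustrative sentence with three noun phrase chunks:
# [The boy] gave [the girl] [a book]
tokens = ["The", "boy", "gave", "the", "girl", "a", "book"]
np_spans = [(0, 2), (3, 5), (5, 7)]

for scheme in ("IOB1", "IOB2"):
    print(scheme, list(zip(tokens, spans_to_tags(len(tokens), np_spans, scheme))))
```

Both encodings carry the same chunk boundaries; they differ only in which tokens receive the `B` tag, which is the kind of representation choice the abstract compares.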

Keywords

DIVIDING SENTENCE
INFORMATION RETRIEVAL
STANDARD DATA
DATA REPRESENTATION CHOICE
CHUNKING PERFORMANCE
DIFFERENT DATA REPRESENTATION
SUITABLE DATA REPRESENTATION
REPRESENTING TEXT CHUNK
MEMORY-BASED LEARNING CHUNKER
DATA REPRESENTATION
INFORMATION EXTRACTION
NOUN PHRASE

All Related Versions

Version 1, 1999-07-06, ArXiv (Unconfirmed version)
