Information extraction: beyond document retrieval
- 1 March 1998
- journal article
- Published by Emerald Publishing in Journal of Documentation
- Vol. 54 (1), 70-105
- https://doi.org/10.1108/eum0000000007162
Abstract
In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE), whose function is to extract information about a pre-specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 1970s to the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining.