Mining knowledge from text using information extraction
- 1 June 2005
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGKDD Explorations Newsletter
- Vol. 7 (1) , 3-10
- https://doi.org/10.1145/1089815.1089817
Abstract
An important approach to text mining involves the use of natural-language information extraction. Information extraction (IE) distills structured data or knowledge from unstructured text by identifying references to named entities as well as stated relationships between such entities. IE systems can be used to directly extricate abstract knowledge from a text corpus, or to extract concrete data from a set of documents which can then be further analyzed with traditional data-mining techniques to discover more general patterns. We discuss methods and implemented systems for both of these approaches and summarize results on mining real text corpora of biomedical abstracts, job announcements, and product descriptions. We also discuss challenges that arise when employing current information extraction technology to discover knowledge in text.
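The second approach described in the abstract (extract concrete data first, then mine it for general patterns) can be illustrated with a minimal Python sketch. This is not the paper's method: a hand-written regular expression stands in for an information extractor, a simple co-occurrence count stands in for a data-mining algorithm, and the corpus `DOCS`, the field names, and the helpers `extract` and `frequent_pairs` are all hypothetical, invented only for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus of job announcements (invented for illustration).
DOCS = [
    "Title: Web Developer. Location: Austin. Skills: Java, SQL, HTML.",
    "Title: Database Engineer. Location: Dallas. Skills: SQL, Java.",
    "Title: Analyst. Location: Austin. Skills: SQL, Excel.",
]

# Assumed pattern: each field looks like "Name: value." with no periods in the value.
FIELD = re.compile(r"(Title|Location|Skills):\s*([^.]+)\.")


def extract(doc):
    """Turn one free-text announcement into a structured record (a dict)."""
    record = {}
    for name, value in FIELD.findall(doc):
        if name == "Skills":
            # Sort so that each skill pair has one canonical ordering.
            record["skills"] = sorted(s.strip() for s in value.split(","))
        else:
            record[name.lower()] = value.strip()
    return record


def frequent_pairs(records, min_support=2):
    """Return skill pairs that co-occur in at least min_support records."""
    counts = Counter()
    for rec in records:
        counts.update(combinations(rec.get("skills", []), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Step 1: extraction produces structured records from unstructured text.
records = [extract(d) for d in DOCS]
# Step 2: a simple mining pass over the extracted records.
patterns = frequent_pairs(records)
print(patterns)
```

On this toy corpus the only pair that meets the support threshold is Java with SQL, which appears in two of the three announcements. A real system would replace the regular expression with a trained extractor and would have to cope with extraction errors before mining.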