Information Extraction

Open Access

1 November 2005

journal article
Published by Association for Computing Machinery (ACM) in Queue

Vol. 3 (9), 48-57
https://doi.org/10.1145/1105664.1105679

Abstract

In 2001 the U.S. Department of Labor was tasked with building a Web site that would help people find continuing education opportunities at community colleges, universities, and organizations across the country. The department wanted its Web site to support fielded Boolean searches over locations, dates, times, prerequisites, instructors, topic areas, and course descriptions. Ultimately it was also interested in mining its new database for patterns and educational trends. This was a major data-integration project, aiming to automatically gather detailed, structured information from tens of thousands of individual institutions every three months.
