REPRESENTATION AND CONTROL STRATEGIES FOR LARGE KNOWLEDGE DOMAINS: An Application to NLP

Abstract
This paper describes the design issues encountered during the development of a natural language processor (NLP) for the Italian language. The focus is on strategic aspects, namely representation and control, and on their implementation in first-order logic. The knowledge domain (press agency releases on finance and economics) is large and complex and imposes no severe restrictions on sentence structure; hence considerable design effort was required for the data structures and control algorithms. Logic proved to be an important tool for implementing, in a modular and efficient way, both the knowledge sources and the programs that derive the morphologic, syntactic, and semantic features of sentences. As for the data structures, we found a considerable advantage in separating linguistic knowledge into three sources: morphologic, syntactic, and semantic. This separation resulted in a clear and systematic representation scheme and reduced the complexity of the parsing system.
