Abstract
Many wording choices in English sentences cannot be accounted for on semantic or syntactic grounds; they are idiosyncratic. They can be expressed in terms of co-occurrence relations among lexical items, and need to be specifically included in dictionaries. For language generation, this type of lexical knowledge is crucial, as it would enhance the process of lexical selection while simplifying input structures. Co-occurrence knowledge is currently not available in compiled form, which is the main reason why it has generally been ignored in the past. We describe in this paper a co-occurrence compiler, EXTRACT, that identifies co-occurrence lexical relations in large on-line corpora of English texts. EXTRACT can be used as a lexicographic tool for compiling machine-readable dictionaries.