Generating robust parsers using island grammars
- 13 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10951350, p. 13-22
- https://doi.org/10.1109/wcre.2001.957806
Abstract
Source model extraction, the automated extraction of information from system artifacts, is a common phase in reverse engineering tools. One of the major challenges of this phase is creating extractors that can deal with irregularities in the artifacts that are typical for the reverse engineering domain (for example, syntactic errors, incomplete source code, language dialects and embedded languages). The paper proposes a solution in the form of island grammars, a special kind of grammar that combines the detailed specification possibilities of grammars with the liberal behavior of lexical approaches. We show how island grammars can be used to generate robust parsers that combine the accuracy of syntactical analysis with the speed, flexibility and tolerance usually only found in lexical analysis. We conclude with a discussion of the development of MANGROVE, a generator for source model extractors based on island grammars and describe its application to a number of case studies.
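The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's MANGROVE generator or its grammar notation, only a hand-written Python approximation of the principle: one detailed production (the "island") describes the construct of interest, and everything else is skipped token by token as "water". The token pattern, the COBOL-like `CALL 'name'` construct, and the function names `parse_island` and `extract_calls` are all assumptions chosen for illustration.

```python
import re

# Hypothetical token pattern: a quoted string, a word, or any single
# non-space character (so arbitrary junk still tokenizes).
TOKEN = re.compile(r"'[^']*'|\w+|\S")


def parse_island(tokens, i):
    """Try the island production  Call ::= "CALL" QuotedName  at index i.

    Returns (callee, next_index) on success, or None if no island starts here.
    """
    if i + 1 < len(tokens) and tokens[i].upper() == "CALL":
        name = tokens[i + 1]
        if len(name) >= 2 and name[0] == "'" and name[-1] == "'":
            return name[1:-1], i + 2
    return None


def extract_calls(source):
    """Collect callee names, treating every non-island token as water."""
    tokens = TOKEN.findall(source)
    calls = []
    i = 0
    while i < len(tokens):
        island = parse_island(tokens, i)
        if island is not None:
            callee, i = island
            calls.append(callee)
        else:
            i += 1  # water: skip one token without trying to understand it
    return calls


# Made-up input with a syntax error and a misleading string literal.
sample = "MOVE X TO Y. @@broken ( CALL 'PAYROLL' USING X. DISPLAY 'CALL ME'. CALL 'TAX'"
print(extract_calls(sample))
```

Because the quoted string `'CALL ME'` is a single token, it is not mistaken for a call, which a plain text search for `CALL` would do. The unbalanced parenthesis and stray characters are simply consumed as water, so the extractor does not fail on them.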
This publication has 21 references indexed in Scilit:
- Separating parsing and analysis in reverse engineering tools. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Fast, flexible syntactic pattern matching and processing. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- GXL: toward a standard exchange format. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- GENOA—a customizable, front-end-retargetable source code analysis framework. ACM Transactions on Software Engineering and Methodology, 1999
- Building documentation generators. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999
- Generation of software renovation factories from compilers. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999
- Tlex. Software: Practice and Experience, 1991
- Substring parsing for arbitrary context-free grammars. ACM SIGPLAN Notices, 1991
- The syntax definition formalism SDF—reference manual—. ACM SIGPLAN Notices, 1989
- Island parsing and bidirectional charts. Published by Association for Computational Linguistics (ACL), 1988