Lightweight lexical source model extraction
- 1 July 1996
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Software Engineering and Methodology
- Vol. 5 (3), 262-292
- https://doi.org/10.1145/234426.234441
Abstract
Software engineers maintaining an existing software system often depend on the mechanized extraction of information from system artifacts. Some useful kinds of information (source models) are well known: call graphs, file dependences, etc. Predicting every kind of source model that a software engineer may need is impossible. We have developed a lightweight approach for generating flexible and tolerant source model extractors from lexical specifications. The approach is lightweight in that the specifications are relatively small and easy to write. It is flexible in that there are few constraints on the kinds of artifacts from which source models are extracted (e.g., we can extract from source code, structured data files, documentation, etc.). It is tolerant in that there are few constraints on the condition of the artifacts. For example, we can extract from source that cannot necessarily be compiled. Our approach extends the kinds of source models that can be easily produced from lexical information while avoiding the constraints and brittleness of most parser-based approaches. We have developed tools to support this approach and applied the tools to the extraction of a number of different source models (file dependences, event interactions, call graphs) from a variety of system artifacts (C, C++, CLOS, Eiffel, TCL, structured data). We discuss our approach and describe its application to extract source models not available using existing systems; for example, we compute the implicitly-invokes relation over Field tools. We compare and contrast our approach to the conventional lexical and syntactic approaches of generating source models.
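To make the idea of a tolerant lexical extractor concrete, the following is a minimal sketch in Python. It is not the specification language or the tools described in the article; the patterns `FUNC_DEF` and `CALL`, the `KEYWORDS` set, and the function `extract_calls` are hypothetical names chosen for illustration. The sketch scans C-like text line by line with regular expressions and emits an approximate call relation, even when the input would not compile.

```python
import re

# Hypothetical lexical patterns (assumptions for illustration only).
# A line that looks like the start of a C function definition: "type name(args) {"
FUNC_DEF = re.compile(r'^\s*(?:\w+[\s\*]+)+(\w+)\s*\([^;{]*\)\s*\{')
# Any identifier immediately followed by an opening parenthesis.
CALL = re.compile(r'\b([A-Za-z_]\w*)\s*\(')
# Control keywords that look like calls lexically but are not.
KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}


def extract_calls(source):
    """Return a set of (caller, callee) pairs found by purely lexical scanning.

    No parsing is done, so the result is approximate: the most recently
    seen definition is treated as the caller for every later call site.
    """
    calls = set()
    current = None  # name of the most recent function definition seen
    for line in source.splitlines():
        rest = line
        m = FUNC_DEF.match(line)
        if m and m.group(1) not in KEYWORDS:
            current = m.group(1)
            rest = line[m.end():]  # skip the definition header itself
        if current is None:
            continue
        for callee in CALL.findall(rest):
            if callee not in KEYWORDS:
                calls.add((current, callee))
    return calls


# Deliberately broken input: the log_msg call is never closed.
SAMPLE = '''
int helper(int x) {
    return x + 1;
}

void main_loop(void) {
    helper(3);
    if (ready()) {
        log_msg("done"   /* unterminated call: this would not compile
    }
}
'''

if __name__ == "__main__":
    for caller, callee in sorted(extract_calls(SAMPLE)):
        print(caller, "->", callee)
```

Because the scan never builds a syntax tree, the unterminated `log_msg` call is still reported. The cost is precision: calls inside comments or string literals, or calls written through macros, may be reported or missed, which is the kind of trade-off between lexical and parser-based extraction that the abstract refers to.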