Pattern matching for design concept localization
- 19 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The effective synergy of a number of different techniques is the key to the successful development of an efficient reverse engineering environment. Compiler technology, pattern matching techniques, visualization tools, and software repositories play an important role in the identification of procedural, data, and abstract-data-type related concepts in the source code. This paper describes a number of techniques used for the development of a distributed reverse engineering environment. Design recovery is investigated through code-to-code and abstract-description-to-code pattern matching techniques used to locate code that may implement a particular plan or algorithm. The code-to-code matching uses dynamic programming techniques to locate similar code fragments and is targeted at large software systems (1 MLOC). Patterns are specified either as source code or as a sequence of abstract statements written in a concept language developed for this purpose. Markov models are used to compute similarity measures between an abstract description and a code fragment in terms of the probability that a given abstract statement can generate a given code fragment. The abstract-description-to-code matcher is under implementation, and early experiments show it is a promising technique.
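The abstract does not say how code fragments are represented or which distance is used, so the following is only a minimal sketch of the general idea of comparing two fragments by dynamic programming. It assumes each fragment has already been reduced to a sequence of statement-type labels; the labels, the function name `fragment_distance`, and the unit costs are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def fragment_distance(
    a: Sequence[str],
    b: Sequence[str],
    ins_cost: float = 1.0,
    del_cost: float = 1.0,
    sub_cost: float = 1.0,
) -> float:
    """Edit distance between two statement-label sequences.

    Assumption: a fragment is a list of statement-type labels such as
    "assign" or "while". The costs are illustrative defaults.
    """
    # prev[j] holds the cost of turning the first i-1 statements of `a`
    # into the first j statements of `b`.
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, stmt_a in enumerate(a, start=1):
        curr = [i * del_cost]  # deleting the first i statements of `a`
        for j, stmt_b in enumerate(b, start=1):
            match = prev[j - 1] + (0.0 if stmt_a == stmt_b else sub_cost)
            delete = prev[j] + del_cost
            insert = curr[j - 1] + ins_cost
            curr.append(min(match, delete, insert))
        prev = curr
    return prev[-1]


# Hypothetical fragments: the second lacks the "if" statement.
frag_a = ["assign", "while", "if", "call", "return"]
frag_b = ["assign", "while", "call", "return"]
distance = fragment_distance(frag_a, frag_b)
```

A smaller distance suggests the two fragments are closer in structure. Keeping only two rows of the table holds memory linear in the length of the second fragment, which matters when many fragment pairs from a large system are compared.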