An empirical study of static call graph extractors
- 1 April 1998
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Software Engineering and Methodology
- Vol. 7 (2) , 158-191
- https://doi.org/10.1145/279310.279314
Abstract
Informally, a call graph represents calls between entities in a given program. The call graphs that compilers compute to determine the applicability of an optimization must typically be conservative: a call may be omitted only if it can never occur in any execution of the program. Numerous software engineering tools also extract call graphs with the expectation that they will help software engineers increase their understanding of a program. The requirements placed on software engineering tools that compute call graphs are typically more relaxed than for compilers. For example, some false negatives—calls that can in fact take place in some execution of the program, but which are omitted from the call graph—may be acceptable, depending on the understanding task at hand. In this article, we empirically show a consequence of this spectrum of requirements by comparing the C call graphs extracted from three software systems (mapmaker, mosaic, and gcc) by nine tools (cflow, cawk, CIA, Field, GCT, Imagix, LSME, Mawk, and Rigiparse). A quantitative analysis of the call graphs extracted for each system shows considerable variation, a result that is counterintuitive to many experienced software engineers. A qualitative analysis of these results reveals a number of reasons for this variation: differing treatments of macros, function pointers, input formats, etc. The fundamental problem is not that variances among the graphs extracted by different tools exist, but that software engineers have little sense of the dimensions of approximation in any particular call graph. In this article, we describe and discuss the study, sketch a design space for static call graph extractors, and discuss the impact of our study on practitioners, tool developers, and researchers. Although this article considers only one kind of information, call graphs, many of the observations also apply to static extractors of other kinds of information, such as inheritance structures, file dependences, and references to global variables.Keywords
This publication has 10 references indexed in Scilit:
- Lightweight lexical source model extractionACM Transactions on Software Engineering and Methodology, 1996
- The Field Programming Environment: A Friendly Integrated Environment for Learning and DevelopmentPublished by Springer Nature ,1995
- How accurate is scientific software?IEEE Transactions on Software Engineering, 1994
- Efficient call graph analysisACM Letters on Programming Languages and Systems, 1992
- Constructing the procedure call multigraphIEEE Transactions on Software Engineering, 1990
- The C information abstraction systemIEEE Transactions on Software Engineering, 1990
- Interprocedural optimization: eliminating unnecessary recompilationACM SIGPLAN Notices, 1986
- Constructing the Call Graph of a ProgramIEEE Transactions on Software Engineering, 1979
- Awk — a pattern scanning and processing languageSoftware: Practice and Experience, 1979
- A practical interprocedural data flow analysis algorithmCommunications of the ACM, 1978