Off-line variable substitution for scaling points-to analysis

Abstract
Most compiler optimizations and software productivity tools rely on information about the effects of pointer dereferences in a program. The purpose of points-to analysis is to compute this information safely, and as accurately as is practical. Unfortunately, accurate points-to information is difficult to obtain for large programs, because the time and space requirements of the analysis become prohibitive. We consider the problem of scaling flow- and context-insensitive points-to analysis to large programs, perhaps containing hundreds of thousands of lines of code. Our approach is based on a variable substitution transformation, which is performed off-line, i.e., before a standard points-to analysis is performed. The general idea of variable substitution is that a set of variables in a program can be replaced by a single representative variable, thereby reducing the input size of the problem. Our main contribution is a linear-time algorithm which finds a particular variable substitution that maintains the precision of the standard analysis, and is also very effective in reducing the size of the problem. We report our experience in performing points-to analysis on large C programs, including some industrial-sized ones. Experiments show that our algorithm can reduce the cost of Andersen's points-to analysis substantially: on average, it reduced the running time by 53% and the memory cost by 59%, relative to an efficient baseline implementation of the analysis.

This publication has 13 references indexed in Scilit: