Finding latent code errors via machine learning over program executions
- 28 September 2004
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 02705257, p. 480-490
- https://doi.org/10.1109/icse.2004.1317470
Abstract
This paper proposes a technique for identifying program properties that indicate errors. The technique generates machine learning models of program properties known to result from errors, and applies these models to program properties of user-written code to classify and rank properties that may lead the user to errors. Given a set of properties produced by the program analysis, the technique selects a subset of properties that are most likely to reveal an error. An implementation, the fault invariant classifier, demonstrates the efficacy of the technique. The implementation uses dynamic invariant detection to generate program properties. It uses support vector machine and decision tree learning tools to classify those properties. In our experimental evaluation, the technique increases the relevance (the concentration of fault-revealing properties) by a factor of 50 on average for the C programs, and 4.8 for the Java programs. Preliminary experience suggests that most of the fault-revealing properties do lead a programmer to an error.
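The relevance measure in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the `Property` record, the classifier scores, the invariant strings, and the data are all made up for illustration, and the scores stand in for the output of a trained model (the paper uses support vector machine and decision tree learners). The sketch also assumes that the reported improvement factor is the relevance of the selected subset divided by the relevance of the full property set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    text: str              # hypothetical invariant, e.g. from dynamic invariant detection
    score: float           # hypothetical classifier confidence that it is fault-revealing
    fault_revealing: bool  # ground truth, available only in an evaluation setting


def relevance(props):
    """Concentration of fault-revealing properties (0.0 for an empty list)."""
    if not props:
        return 0.0
    return sum(p.fault_revealing for p in props) / len(props)


def select_top(props, k):
    """Rank properties by descending classifier score and keep the top k."""
    return sorted(props, key=lambda p: p.score, reverse=True)[:k]


def relevance_improvement(props, k):
    """Assumed definition: relevance of the top-k subset over baseline relevance."""
    baseline = relevance(props)
    if baseline == 0.0:
        raise ValueError("no fault-revealing properties in the input set")
    return relevance(select_top(props, k)) / baseline


# Made-up evaluation data: 2 of 8 properties are fault-revealing.
props = [
    Property("x != null", 0.10, False),
    Property("i < len", 0.90, True),
    Property("count >= 0", 0.20, False),
    Property("a[i] <= a[i+1]", 0.80, False),
    Property("size == n - 1", 0.70, True),
    Property("ptr == orig(ptr)", 0.05, False),
    Property("y > 0", 0.15, False),
    Property("flag == true", 0.30, False),
]

print(relevance(props))                 # baseline relevance of the full set
print(relevance_improvement(props, 2))  # improvement when reporting only the top 2
```

Under these assumptions, a user who reads only the highest-ranked properties sees a denser concentration of fault-revealing ones than a user who reads the full output of the program analysis.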