A comprehensive evaluation of capture-recapture models for estimating software defect content
- 1 June 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Software Engineering
- Vol. 26 (6) , 518-540
- https://doi.org/10.1109/32.852741
Abstract
An important requirement to control the inspection of software artifacts is to be able to decide, based on more objective information, whether the inspection can stop or whether it should continue to achieve a suitable level of artifact quality. A prediction of the number of remaining defects in an inspected artifact can be used for decision making. Several studies in software engineering have considered capture-recapture models, originally proposed by biologists to estimate animal populations, to make a prediction. However, few studies compare the actual number of remaining defects to the one predicted by a capture-recapture model on real software engineering artifacts. Thus, there is little work looking at the robustness of capture-recapture models under realistic software engineering conditions, where it is expected that some of their assumptions will be violated. Simulations have been performed, but no definite conclusions can be drawn regarding the degree of accuracy of such models under realistic inspection conditions and the factors affecting this accuracy. Furthermore, the existing studies focused on a subset of the existing capture-recapture models. Thus, a more exhaustive comparison is still missing. In this study, we focus on traditional inspections and estimate, based on actual inspection data, the degree of accuracy of relevant, state-of-the-art capture-recapture models as they have been proposed in biology and for which statistical estimators exist. In order to assess their robustness, we look at the impact of the number of inspectors and the number of actual defects on the estimators' accuracy based on actual inspection data. Our results show that models are strongly affected by the number of inspectors and, therefore, one must consider this factor before using capture-recapture models. When the number of inspectors is too small, no model is sufficiently accurate and underestimation may be substantial.
In addition, some models perform better than others in a large number of conditions and plausible reasons are discussed. Based on our analyses, we recommend using a model taking into account that defects have different probabilities of being detected and the corresponding Jackknife Estimator. Furthermore, we attempt to calibrate the prediction models based on their relative error, as previously computed on other inspections. Although intuitive and straightforward, we identified theoretical limitations to this approach which were then confirmed by the data.
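The abstract does not reproduce the estimator itself. As a rough illustration only, the sketch below uses the first-order jackknife form commonly given for models with heterogeneous detection probabilities, N = D + f1 * (k - 1) / k, where D is the number of distinct defects found, f1 is the number found by exactly one inspector, and k is the number of inspectors. Whether this is the exact order or variant evaluated in the paper is an assumption. The relative-error definition (estimate minus actual, divided by actual) and the names `jackknife_estimate` and `relative_error` are likewise assumptions introduced for this example.

```python
from collections import Counter


def jackknife_estimate(detections):
    """First-order jackknife estimate of total defect content.

    detections: list of sets, one per inspector, holding the ids of the
    defects that inspector found (hypothetical input format).
    """
    k = len(detections)
    if k == 0:
        raise ValueError("at least one inspector is required")
    # Count how many inspectors found each defect.
    counts = Counter(d for found in detections for d in found)
    distinct = len(counts)
    # f1: defects found by exactly one inspector.
    f1 = sum(1 for c in counts.values() if c == 1)
    return distinct + f1 * (k - 1) / k


def relative_error(estimated, actual):
    """Signed relative error; negative values indicate underestimation."""
    return (estimated - actual) / actual


# Made-up data: three inspectors, defects labelled 1..5.
example = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
estimate = jackknife_estimate(example)
```

With this made-up data, five distinct defects are found, three of them by a single inspector, so the estimate is 5 + 3 * 2/3 = 7. If the artifact actually contained 8 defects, the relative error would be -0.125, an underestimate of the kind the abstract warns about.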