Comparison of clustering algorithms in the context of software evolution
- 1 January 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636773, p. 525-535
- https://doi.org/10.1109/icsm.2005.31
Abstract
To aid software analysis and maintenance tasks, a number of software clustering algorithms have been proposed to automatically partition a software system into meaningful subsystems or clusters. However, it is unknown whether these algorithms produce similar meaningful clusterings for similar versions of a real-life software system under continual change and growth. This paper describes a comparative study of six software clustering algorithms. We applied each of the algorithms to subsequent versions from five large open source systems. We conducted comparisons based on three criteria: stability (Does the clustering change only modestly as the system undergoes modest updating?), authoritativeness (Does the clustering reasonably approximate the structure an authority provides?) and extremity of cluster distribution (Does the clustering avoid huge clusters and many very small clusters?). Experimental results indicate that the studied algorithms exhibit distinct characteristics. For example, the clusterings from the most stable algorithm bear little similarity to the implemented system structure, while the clusterings from the least stable algorithm have the best cluster distribution. Based on the obtained results, we claim that current automatic clustering algorithms need significant improvement to provide continual support for large software projects.
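The three criteria in the abstract can be made concrete with a small sketch. The abstract does not say which measures the study uses, so everything below is an assumption for illustration: a Rand-style pair agreement stands in for both stability (clustering of version n against version n+1) and authoritativeness (clustering against an authority's decomposition), and the `small` and `large_share` thresholds for extremity are hypothetical. The file names are made up.

```python
from itertools import combinations


def cluster_of(clustering):
    """Map each entity to the index of the cluster that contains it."""
    return {entity: i for i, cluster in enumerate(clustering) for entity in cluster}


def pair_agreement(a, b):
    """Rand-style agreement over entities present in both clusterings.

    A pair of entities counts as agreeing when both clusterings put the
    pair together, or both keep it apart. This is a stand-in measure,
    not necessarily the one used in the paper.
    """
    ca, cb = cluster_of(a), cluster_of(b)
    shared = sorted(set(ca) & set(cb))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 1.0
    agreeing = sum((ca[x] == ca[y]) == (cb[x] == cb[y]) for x, y in pairs)
    return agreeing / len(pairs)


def extremity(clustering, small=2, large_share=0.5):
    """Fraction of clusters that are very small or very large.

    `small` and `large_share` are hypothetical thresholds: a cluster is
    extreme if it has fewer than `small` entities or holds more than
    `large_share` of all entities.
    """
    total = sum(len(cluster) for cluster in clustering)
    extreme = [
        c for c in clustering
        if len(c) < small or len(c) > large_share * total
    ]
    return len(extreme) / len(clustering)


# Two made-up clusterings of consecutive versions of a tiny system.
v1 = [{"a.c", "b.c"}, {"c.c", "d.c"}]
v2 = [{"a.c", "b.c"}, {"c.c"}, {"d.c", "e.c"}]

stability = pair_agreement(v1, v2)       # compares v1 with v2 on shared files
authoritativeness = pair_agreement(v2, v1)  # here v1 plays the "authority"
print(stability, extremity(v1), extremity(v2))
```

Under these assumptions a higher pair agreement means a more stable (or more authoritative) clustering, and a lower extremity value means a more even cluster distribution. A real study would need measures that also account for entities added or removed between versions, which this sketch ignores by comparing only shared entities.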