Identification of data cohesive subsystems using data mining techniques

Abstract
Reengineering and maintaining large legacy systems involves the use of design recovery techniques to produce abstractions that facilitate the understanding of the system. We present an approach to design recovery based on data mining. This approach derives from the observation that data mining can discover unsuspected, non-trivial relationships among elements in large databases. This observation suggests that data mining can be used to elicit new knowledge about the design of a subject system and that it can be applied to large legacy systems. We describe the ISA methodology, which uses data mining to identify data cohesive subsystems. Using this approach, we were able to decompose COBOL systems into subsystems. Our experience shows that data mining can identify data cohesive subsystems without any previous knowledge of the subject system. Furthermore, data mining can produce meaningful results regardless of system size, making this approach especially appropriate to the analysis of large undocumented systems.
