Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Doris L. Carver


A majority of legacy systems in use in the scientific and engineering application domains are coded in imperative languages, specifically COBOL or FORTRAN-77. These systems have an average age of 15 years or more and have undergone years of extensive maintenance. They suffer from poor or missing documentation and from antiquated coding practices and paradigms (Chik94) (Osbo90). The purpose of this research is to develop a reverse-engineering methodology to extract an object-oriented design from legacy systems written in imperative languages. This research defines a three-phase methodology that takes source code as input and produces an object-oriented design as output. The three phases are Object Extraction, Class Abstraction, and Formation of the Inheritance Hierarchy. They are preceded by a pre-processing phase that involves code structuring, alias resolution, and resolution of the COMMON block. Object Extraction is divided into two stages: Attribute Identification and Method Identification. The output of phase one is a set of candidate objects that serves as input to phase two, Class Abstraction. The Class Abstraction phase uses clustering techniques to form classes and define the concept of identical objects. The output of phase two is a set of classes that serves as input to phase three, Formation of the Inheritance Hierarchy. This third phase defines a similarity measure that determines class similarity and further refines the clustering performed in phase two. The result of the methodology is an object-oriented design that includes hierarchy diagrams and interaction diagrams. The results of applying the methodology in two case studies are also presented. The research has resulted in a unique methodology for extracting object-oriented designs from imperative legacy systems.
The benefits of using the methodology include the ability to capture system functionality that may not be apparent because of poor system structure, and a reduction in the future maintenance costs of the system as a direct effect of accurate system documentation and updated programming technologies.
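The three phases can be pictured with a small toy sketch. The abstract does not state the extraction rules, the clustering technique, or the similarity measure, so everything below is an assumption made for illustration only: pre-processing is taken to have already produced named data groups (for example, resolved COMMON blocks), "identical objects" is read as "same set of attribute descriptors", and class similarity is approximated with Jaccard overlap. All names (`Candidate`, `extract_objects`, `abstract_classes`, `suggest_hierarchy`, and the sample data) are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    name: str
    attributes: frozenset  # attribute descriptors, e.g. "x:REAL" (assumed encoding)
    methods: frozenset     # names of routines that touch the data


def extract_objects(data_groups, routine_refs):
    """Phase 1 (assumed rule): one candidate object per data group.

    Attribute Identification: the group's variables become attributes.
    Method Identification: any routine referencing one of them is a method.
    """
    candidates = []
    for group, variables in data_groups.items():
        attributes = frozenset(variables.values())
        methods = frozenset(
            routine for routine, used in routine_refs.items()
            if used & set(variables)
        )
        candidates.append(Candidate(group, attributes, methods))
    return candidates


def abstract_classes(candidates):
    """Phase 2 (assumed rule): candidates with identical attribute sets form one class."""
    grouped = {}
    for candidate in candidates:
        grouped.setdefault(candidate.attributes, []).append(candidate)
    classes = []
    for attributes, members in grouped.items():
        name = "+".join(member.name for member in members)
        methods = frozenset().union(*(member.methods for member in members))
        classes.append(Candidate(name, attributes, methods))
    return classes


def similarity(a, b):
    """Jaccard overlap of attribute sets (a stand-in, not the dissertation's measure)."""
    union = a.attributes | b.attributes
    return len(a.attributes & b.attributes) / len(union) if union else 0.0


def suggest_hierarchy(classes, threshold=0.5):
    """Phase 3 (assumed rule): similar classes share a parent holding the common attributes."""
    suggestions = []
    for a, b in combinations(classes, 2):
        if similarity(a, b) >= threshold:
            suggestions.append((a.name, b.name, a.attributes & b.attributes))
    return suggestions


# Hypothetical input: variable name -> descriptor, per data group.
data_groups = {
    "PTS_A": {"XA": "x:REAL", "YA": "y:REAL"},
    "PTS_B": {"XB": "x:REAL", "YB": "y:REAL"},
    "PTS3": {"X3": "x:REAL", "Y3": "y:REAL", "Z3": "z:REAL"},
    "IO": {"LUN": "unit:INTEGER"},
}
# Hypothetical input: routine name -> variables it references.
routine_refs = {
    "MOVEA": {"XA", "YA"},
    "MOVEB": {"XB", "YB"},
    "LIFT": {"Z3"},
    "DUMP": {"LUN", "XA"},
}

candidates = extract_objects(data_groups, routine_refs)
classes = abstract_classes(candidates)
hierarchy = suggest_hierarchy(classes)
```

In this toy input, the two 2-D point groups collapse into one class in phase two, and phase three flags that class and the 3-D point class as candidates for a common parent because they share the `x` and `y` attributes. The 0.5 threshold is arbitrary and would need tuning against real code.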