Semester of Graduation
Summer 2021
Degree
Master of Science (MS)
Department
Computer Science
Document Type
Thesis
Abstract
Traceability link recovery (TLR) is a software engineering activity that helps ensure software quality and keep track of changes by establishing links between the artifacts of the software engineering process, such as requirements, use cases, source code, test cases, and documentation. Software requirement artifacts are typically written in natural language. Information Retrieval (IR) is frequently used in software engineering activities, including TLR. Recently, Word Embedding (WE) techniques have been used in many natural language processing tasks as well as in TLR tasks. We investigate the effectiveness of WE techniques in conjunction with the Artificial Bee Colony (ABC) algorithm for automating the TLR process between requirements and source code. The ABC algorithm, a metaheuristic Swarm Intelligence (SI) search algorithm that simulates the behavior of honeybee swarms, is useful for solving multidimensional optimization problems. We use a modified ABC algorithm in which the initial population is generated randomly based on the document ID number within the document set boundaries. We use the algorithm to optimize the objective function and find the best links between the requirements and the source code. For our investigation we use three open-source pretrained models: Word2Vec, GloVe, and FastText. We experiment with three objective functions that are optimized by the ABC algorithm to find the best possible links between the documents. Our experimentation with three datasets indicates that the three objective functions result in similar success rates. We use precision, recall, and the F1 measure to determine effectiveness for the TLR task.
Our results show that the recall is higher than the precision and that the resulting F1 value does not indicate promise for combining word embedding, our three objective functions, and the modified ABC algorithm as a recommended approach for automating traceability links between requirements and source code.
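As a rough illustration of the evaluation measures named in the abstract, the sketch below computes precision, recall, and F1 for a set of recovered trace links against a ground-truth set. It is a minimal example under stated assumptions, not the thesis's evaluation code: the function name, the representation of a link as a (requirement ID, source file) pair, and the sample data are all hypothetical.

```python
def precision_recall_f1(recovered_links, true_links):
    """Score recovered trace links against a ground-truth link set.

    Each link is assumed (for illustration only) to be a
    (requirement_id, source_file) pair.
    """
    recovered = set(recovered_links)
    truth = set(true_links)
    correct = len(recovered & truth)  # links that are both recovered and true

    # Guard against empty sets to avoid division by zero.
    precision = correct / len(recovered) if recovered else 0.0
    recall = correct / len(truth) if truth else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data, not taken from the thesis datasets.
true_links = {("R1", "A.java"), ("R1", "B.java"), ("R2", "C.java")}
recovered_links = {
    ("R1", "A.java"), ("R1", "B.java"), ("R2", "C.java"),
    ("R2", "D.java"), ("R3", "A.java"), ("R3", "E.java"),
}

precision, recall, f1 = precision_recall_f1(recovered_links, true_links)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up case every true link is recovered but half of the recovered links are spurious, so recall exceeds precision and F1, the harmonic mean of the two, is pulled toward the lower value.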
Recommended Citation
Khatun, Mahfuza, "EVALUATING WORD EMBEDDING MODELS FOR TRACEABILITY" (2021). LSU Master's Theses. 5414.
https://repository.lsu.edu/gradschool_theses/5414
Committee Chair
Carver, Doris L.
DOI
10.31390/gradschool_theses.5414