Degree

Doctor of Philosophy (PhD)

Department

Computer Science

Document Type

Dissertation

Abstract

Malware authors currently lead in the lopsided race against security professionals. Millions of malware samples are recorded daily with little or no information about their activity or behavior. In 2019, recorded malware passed the billion mark, according to an AV-Test report dated July 26, 2020. That figure has since risen to over 1.338 billion, according to an AV-Test report dated March 21, 2022. VirusTotal, an online malware scanner, reported over 2.1 million distinct new malware samples over a seven-day period and over 2.8 million submissions on March 15, 2022 alone. PE (Portable Executable) EXE, Unknown, XML, Android, and PE DLL samples make up the top five file types during this period. As the arms race between malware authors and security professionals continues, we must have better methods for detecting malware and gaining insight into malware behaviors. There is also a critical need for ground truth data, both for memory forensics investigations and for new research in the area. For investigators, ground truth is essential in distinguishing "normal" from "malicious." For researchers, memory forensics frameworks must carefully model essential data structures and algorithms, a task that is difficult and frequently depends on specific versions of operating systems and applications. Ground truth provides critical data to support testing and verification. Currently, no large-scale repositories provide "known clean" memory captures for investigators to compare against those from potentially infected systems or for developers to confirm that their tools work correctly. Therefore, developing a large-scale, freely available repository of memory captures is crucial. MemForC is an open-source framework of techniques designed to create a corpus of memory captures from the successful execution of malware on Windows, Linux, and macOS systems.
MemForC is designed according to best practices for building dynamic analysis systems and leverages existing memory forensics tools. This repository will provide ground truth for investigators, allow malware research to proceed quickly while remaining reproducible and verifiable, and enhance education and training to meet the demand for skilled memory forensics professionals. Our corpus will be freely available to the forensics community.

Date

4-5-2024

Committee Chair

Richard III, Golden G.

Available for download on Monday, April 05, 2027
