Identifier
etd-07092008-211059
Degree
Master of Science in Computer Science (MSCS)
Department
Computer Science
Document Type
Thesis
Abstract
The increasing computational and data requirements of scientific applications have made the use of large clustered systems and distributed resources inevitable. Although executing large applications in these environments brings increased performance, automating the process becomes increasingly challenging. Complex workflow management systems have been a viable solution for this automation. In this thesis, we study a broad range of workflow management tools and compare their capabilities, especially in terms of the dynamic and conditional structures they support, which are crucial for the automation of complex applications. We then apply some of these tools to two real-life scientific applications: i) simulation of DNA folding, and ii) reservoir uncertainty analysis. Our implementation is based on the Pegasus workflow planning tool, the DAGMan workflow execution system, the Condor-G computational scheduler, and the Stork data scheduler. The designed abstract workflows are converted to concrete workflows using Pegasus, where jobs are matched to resources; DAGMan makes sure these jobs execute reliably and in the correct order on the remote resources; Condor-G performs the scheduling for the computational tasks; and Stork optimizes the data movement between different components. The integrated solution built from these tools allows the automation of large-scale applications and provides reliable and efficient execution of complex workflows. We have also developed a new site selection mechanism on top of these systems, which chooses the most available computing resources for the submission of tasks. The details of our design and implementation, as well as experimental results, are presented.
Date
2008
Document Availability at the Time of Submission
Release the entire work immediately for access worldwide.
Recommended Citation
Bahsi, Emir Mahmut, "Dynamic workflow management for large scale scientific applications" (2008). LSU Master's Theses. 1576.
https://repository.lsu.edu/gradschool_theses/1576
Committee Chair
Kosar, Tevfik
DOI
10.31390/gradschool_theses.1576