Master of Science in Petroleum Engineering (MSPE)


Petroleum Engineering

Document Type

Access to Thesis Restricted to LSU Campus


Data-driven modeling (DDM) techniques use machine learning (ML) to analyze data and discover connections without explicit knowledge of the underlying physical behavior. Recent improvements in technology and computational power have increased interest in applying DDM in the petroleum industry. Evaluating recovery processes with numerical reservoir simulators is costly, time-consuming, and computationally intensive, and it involves many assumptions and uncertainties. In this thesis, DDM is adopted as an alternative tool to predict production performance under waterflooding, which is one of the most important techniques for improving oil recovery. A synthetic waterflooding dataset comprising production profiles, operational parameters, reservoir properties, and well locations is constructed using a numerical reservoir simulator. Exploratory data analysis provides several insights into the non-intuitive factors in building the reservoir model. K-means clustering analysis is performed to identify internal groupings among producers. An artificial neural network (ANN) and a support vector machine (SVM), specifically support vector regression (SVR), are used to decipher the nonlinear relationships between input attributes and waterflooding production. The trained models are subsequently used to predict cumulative oil and watercut on unseen samples. Clustering analysis reveals that distance to the free water level (FWL) has a dominant effect. The cluster with the smallest average distance to the FWL tends to have the highest watercut and the lowest cumulative oil when compared with the simulation results. Clustering results also indicate that the cluster assignment is controlled by the interplay among input attributes characterizing reservoir properties and relative well locations. Good agreement between the model predictions and the simulation targets demonstrates the satisfactory generalization performance and predictive capability of the ANN and SVR methods.
The ANN model with one output provides the most accurate predictions on the test data. The ANN model with two outputs demonstrates the robustness of this approach. The SVR models provide similar but slightly less accurate forecasts than the ANN models. No previous work has studied the application of SVM to waterflooding performance prediction; the results of this study verify its acceptability and applicability. The methodologies proposed in this thesis can be used as surrogate or complementary models to analyze and predict recovery processes in other reservoirs quickly and efficiently.
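
As an illustration of the clustering step summarized above, the following is a minimal k-means sketch in plain Python. It is not the thesis implementation: the function name `kmeans`, the two features (normalized distance to the FWL and normalized permeability), and the six sample producers are all hypothetical, and the features are assumed to be already scaled to comparable ranges.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct samples
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        # Stop once neither the labels nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical producers: (normalized distance to FWL, normalized permeability).
producers = [
    (0.10, 0.50), (0.15, 0.55), (0.12, 0.45),  # close to the FWL
    (0.80, 0.50), (0.85, 0.60), (0.90, 0.40),  # far from the FWL
]
labels, centroids = kmeans(producers, k=2)

# Identify the cluster whose centroid lies closest to the FWL.
near_fwl_cluster = min(range(2), key=lambda j: centroids[j][0])
```

In a workflow like the one the abstract describes, the resulting cluster labels could then be compared against simulated watercut and cumulative oil for each producer. A library implementation with feature standardization would normally be preferred over this hand-written loop.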



Document Availability at the Time of Submission

Student has submitted appropriate documentation to restrict access to LSU for 365 days, after which the document will be released for worldwide access.

Committee Chair

Hughes, Richard