Doctor of Philosophy (PhD)


Plant, Environmental Management and Soil Sciences

Document Type



Initially, 46 petroleum-contaminated and non-contaminated soil samples were collected and scanned using visible near-infrared diffuse reflectance spectroscopy (VisNIR DRS) at three combinations of moisture content and pretreatment. The VisNIR spectra of the soil samples were used to predict total petroleum hydrocarbon (TPH) content using partial least squares (PLS) regression and boosted regression tree (BRT) models. The field-moist intact scan proved best for predicting TPH content, with a validation r² of 0.64 and a residual prediction deviation (RPD) of 1.70. The same 46 samples were used to calibrate a penalized spline (PS) model. Subsequently, the PS model was used to predict soil TPH content for 128 soil samples collected over an 80 ha study site. An exponential semivariogram fitted to the PS predictions revealed strong spatial dependence among soil TPH values [r² = 0.76, range = 52 m, nugget = 0.001 (log₁₀ mg kg⁻¹)², and sill = 1.044 (log₁₀ mg kg⁻¹)²]. An ordinary block kriging map produced from the data showed that the TPH distribution matched the expected TPH variability of the study site.

A second study used DRS to measure reflectance patterns of 68 artificially constructed samples with different clay contents, organic carbon levels, petroleum types, and levels of contamination per type. Both the first derivative of reflectance and discrete wavelet transformations were used to preprocess the spectra. Principal component analysis (PCA) was applied for qualitative VisNIR discrimination of soil types, organic carbon levels, petroleum types, and concentration levels. Linear discriminant analysis separated soil types with 100% accuracy and organic carbon levels with 96% accuracy. The support vector machine produced 82% classification accuracy for organic carbon levels under repeated random splitting of the whole dataset. However, the spectral absorptions of the individual petroleum hydrocarbons overlapped and could not be separated with any classification scheme when contaminants were mixed. Wavelet-based multiple linear regression performed best for predicting petroleum content, with the highest RPD of 3.97. With the first derivative of the reflectance spectra, PS regression performed better (RPD = 3.3) than the PLS model (RPD = 2.5). Specific calibrations that account for additional soil physicochemical variability are recommended to produce improved predictions.
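The abstract reports model quality as validation r² and RPD without stating how either was computed. The sketch below shows one common convention, offered as an assumption rather than the author's procedure: RPD as the sample standard deviation of the reference values divided by the root mean squared error of prediction, and r² as one minus the ratio of residual to total sum of squares. The function names are hypothetical, and the choice of RMSE (rather than a bias-corrected standard error of prediction) and of an n - 1 denominator for the standard deviation are assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between reference and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator; an assumption)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def rpd(observed, predicted):
    """Residual prediction deviation: SD of reference values over RMSE."""
    return sample_sd(observed) / rmse(observed, predicted)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Placeholder values for illustration only; not data from the study.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    estimate = [1.5, 1.5, 3.5, 3.5, 5.0]
    print("RPD:", round(rpd(reference, estimate), 2))
    print("r2:", round(r_squared(reference, estimate), 2))
```

Under this convention a larger RPD means the prediction error is small relative to the spread of the reference values, which is why the RPD values of 3.3 and 3.97 reported for the constructed samples indicate tighter predictions than the 1.70 obtained on the field samples.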



Document Availability at the Time of Submission

Release the entire work immediately for access worldwide.

Committee Chair

Weindorf, David C.