A Machine Learning Framework for Detecting COVID-19 Infection Using Surface-Enhanced Raman Scattering

Eloghosa Ikponmwoba, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.
Okezzi Ukorigho, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.
Parikshit Moitra, Department of Pediatrics, Center for Blood Oxygen Transport and Hemostasis, University of Maryland Baltimore School of Medicine, Baltimore, MD 21201, USA.
Dipanjan Pan, Department of Pediatrics, Center for Blood Oxygen Transport and Hemostasis, University of Maryland Baltimore School of Medicine, Baltimore, MD 21201, USA.
Manas Ranjan Gartia, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.
Opeoluwa Owoyele, Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA.

Abstract

In this study, we explored machine learning approaches for predictive diagnosis using surface-enhanced Raman scattering (SERS), applied to the detection of COVID-19 infection in biological samples. We used SERS data collected from 20 patients at the University of Maryland Baltimore School of Medicine. As a preprocessing step, positive and negative labels were obtained using polymerase chain reaction (PCR) testing. First, we compared the performance of linear and nonlinear dimensionality reduction techniques for projecting the high-dimensional Raman spectra onto a low-dimensional space in which each sample is described by a smaller number of variables. The number of reduced features was selected by comparing the mean accuracy from 10-fold cross-validation. We then employed Gaussian process (GP) classification, a probabilistic machine learning approach, to predict whether a sample is negative or positive as a function of the low-dimensional variables. Rather than assigning rigid class labels, the GP classifier provides a probability (ranging from zero to one) that a given sample is positive. In practice, the proposed framework can be used to provide high-throughput rapid testing, and a follow-up PCR test can be used for confirmation in cases where the model's uncertainty is unacceptably high.
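The workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses PCA as one example of a linear dimensionality reduction technique, and runs on synthetic data standing in for SERS spectra. The candidate feature counts, the RBF kernel, the `triage` helper, and its 0.2 uncertainty band are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for SERS spectra (NOT the study's data):
# 60 spectra with 200 intensity values each, half labeled positive.
rng = np.random.default_rng(0)
n_per_class, n_wavenumbers = 30, 200
X = rng.normal(size=(2 * n_per_class, n_wavenumbers))
y = np.repeat([0, 1], n_per_class)
X[y == 1, 50:60] += 1.5  # hypothetical band that differs in positive samples


def make_model(n_components):
    """Standardize, project onto n_components PCA features, then fit a GP classifier."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0),
    )


# Choose the number of reduced features by mean 10-fold cross-validation accuracy.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
candidates = [2, 3, 5, 8]  # assumed search grid
mean_accuracy = {
    k: cross_val_score(make_model(k), X, y, cv=cv, scoring="accuracy").mean()
    for k in candidates
}
best_k = max(mean_accuracy, key=mean_accuracy.get)

# Refit on all data and read off the probability that each sample is positive.
# (Scored on the training spectra here only to keep the sketch short.)
model = make_model(best_k).fit(X, y)
p_positive = model.predict_proba(X)[:, 1]


def triage(p, band=0.2):
    """Map a probability to a decision; uncertain cases are sent to PCR.

    The band width is an assumed threshold, not a value from the study.
    """
    if p >= 1.0 - band:
        return "positive"
    if p <= band:
        return "negative"
    return "retest with PCR"


decisions = [triage(p) for p in p_positive]
```

Wrapping the scaler, the projection, and the classifier in one pipeline means the projection is refit inside each cross-validation fold, so the held-out spectra do not influence the reduced features. A nonlinear reduction method could be swapped in for `PCA` to make the linear versus nonlinear comparison.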