A Machine Learning Framework for Detecting COVID-19 Infection Using Surface-Enhanced Raman Scattering
Abstract
In this study, we explored machine learning approaches for predictive diagnosis using surface-enhanced Raman scattering (SERS), applied to the detection of COVID-19 infection in biological samples. To do this, we utilized SERS data collected from 20 patients at the University of Maryland Baltimore School of Medicine. As a preprocessing step, the positive-negative labels are obtained using Polymerase Chain Reaction (PCR) testing. First, we compared the performance of linear and nonlinear dimensionality techniques for projecting the high-dimensional Raman spectra to a low-dimensional space where a smaller number of variables defines each sample. The appropriate number of reduced features used was obtained by comparing the mean accuracy from a 10-fold cross-validation. Finally, we employed Gaussian process (GP) classification, a probabilistic machine learning approach, to correctly predict the occurrence of a negative or positive sample as a function of the low-dimensional space variables. As opposed to providing rigid class labels, the GP classifier provides a probability (ranging from zero to one) that a given sample is positive or negative. In practice, the proposed framework can be used to provide high-throughput rapid testing, and a follow-up PCR can be used for confirmation in cases where the model's uncertainty is unacceptably high.