Machine-learning-assisted spontaneous Raman spectroscopy classification and feature extraction for the diagnosis of human laryngeal cancer
Abstract
Early detection of laryngeal cancer significantly increases survival rates, permits more conservative larynx-sparing treatments, and reduces healthcare costs. A non-invasive optical form of biopsy for laryngeal carcinoma could increase the early detection rate, allow more accurate monitoring of recurrence, and improve intraoperative margin control. In this study, we evaluated a Raman spectroscopy system for the rapid intraoperative detection of human laryngeal carcinoma. The spectral analysis methods were principal component analysis (PCA), random forest (RF), and a one-dimensional (1D) convolutional neural network (CNN). We measured Raman spectra from 207 normal and 500 tumor sites collected from 10 human laryngeal cancer surgical specimens. RF analysis yielded an overall accuracy of 90.5%, a sensitivity of 88.2%, and a specificity of 92.8%, averaged over 10 trials. The 1D CNN achieved the highest performance, with an accuracy of 96.1%, a sensitivity of 95.2%, and a specificity of 96.9%, averaged over 50 trials. Both RF and CNN performed well in predicting the first three principal components (PCs) of the normal and tumor data, with the exception of tumor PC2. This is the first study in which CNN-assisted Raman spectroscopy was used to identify human laryngeal cancer tissue with extracted feature weights. The proposed Raman spectroscopy feature extraction approach has not previously been applied to human cancer diagnosis. Raman spectroscopy assisted by machine learning (ML) methods has the potential to serve as an intraoperative, non-invasive tool for the rapid diagnosis of laryngeal cancer and for margin detection.
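The accuracy, sensitivity, and specificity figures quoted in the abstract are averages over repeated trials. As a reminder of how such figures are typically computed, the following is a minimal sketch that assumes tumor sites are the positive class (label 1) and normal sites the negative class (label 0). The helper names `binary_metrics` and `average_over_trials` are hypothetical, and the toy labels are invented for illustration; they are not the study's data, and this is not presented as the authors' evaluation code.

```python
from statistics import mean


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for one trial.

    Assumption: `positive` marks tumor sites; any other label is normal.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled site is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of tumor sites correctly flagged as tumor.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of normal sites correctly flagged as normal.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


def average_over_trials(trials):
    """Average each metric over an iterable of (y_true, y_pred) trials."""
    per_trial = [binary_metrics(t, p) for t, p in trials]
    if not per_trial:
        raise ValueError("at least one trial is required")
    # zip(*per_trial) regroups the results metric by metric.
    return tuple(mean(column) for column in zip(*per_trial))


# Toy example with made-up labels (two trials).
toy_trials = [
    ([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]),
    ([1, 0, 1, 0], [1, 0, 1, 0]),
]
mean_accuracy, mean_sensitivity, mean_specificity = average_over_trials(toy_trials)
```

Because the data set is imbalanced (500 tumor versus 207 normal sites), sensitivity and specificity are reported alongside accuracy: a classifier that labelled every site as tumor would still reach roughly 71% accuracy while having zero specificity.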