Doctor of Philosophy (PhD)
Electrical and Computer Engineering
High-speed videoendoscopy (HSV) of the larynx far surpasses videostroboscopy in evaluating vocal fold vibratory behavior because it provides a much higher frame rate. HSV enables visualization of the vocal fold vibratory pattern within an actual glottic cycle. This detailed information on vocal fold vibratory characteristics could be valuable for assessing vocal fold vibratory function in disordered voices and the effects of behavioral, medical, and surgical treatments. In this work, we address the problem of classifying voice disorders of varying etiology in four steps. Our methodology starts with glottis segmentation: given HSV data, the contour of the glottal opening area is acquired in each frame. These contours record the vibration track of the vocal folds. Second, we obtain a reliable glottal axis, which is necessary for computing certain vibratory features. The third step is feature extraction from the HSV data. In the last step, we perform classification based on the features obtained in step 3.
In this study, we first propose a novel glottis segmentation method based on simplified dynamic programming, which proves to be efficient and accurate. In addition, we introduce a new approach for calculating the glottal axis. By comparing the proposed glottal axis determination method (modified linear regression) against state-of-the-art techniques, we demonstrate that our technique is more reliable. The focus then shifts to feature extraction and classification schemes. Eighteen different features are extracted, and their discriminative power is evaluated using principal component analysis. A support vector machine and a neural network are implemented to classify three types of vocal folds (normal vocal fold, unilateral vocal fold polyp, and unilateral vocal fold paralysis).
The results demonstrate that the classification rates for four different tasks are all above 80%.
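To make the glottal axis step more concrete, the following is a minimal sketch of fitting a straight axis to one frame's glottal contour with ordinary least squares. It is not the dissertation's modified linear regression, whose details the abstract does not give. The function names `fit_glottal_axis` and `axis_x_at`, the (x, y) pixel layout, and the choice to regress x on y (assuming the glottis runs roughly vertically in the image) are all assumptions made for illustration.

```python
import numpy as np


def fit_glottal_axis(contour):
    """Fit a line x = a * y + b to glottal contour points (illustrative only).

    contour: array-like of shape (N, 2) holding (x, y) pixel coordinates.
    Regressing x on y is an assumption that the glottis is roughly vertical;
    a plain least-squares fit stands in for the unspecified modified method.
    """
    pts = np.asarray(contour, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    a, b = np.polyfit(y, x, deg=1)  # slope and intercept of x as a function of y
    return a, b


def axis_x_at(y, a, b):
    """Return the x position of the fitted axis at image row(s) y."""
    return a * np.asarray(y, dtype=float) + b


if __name__ == "__main__":
    # Synthetic contour: an upright ellipse standing in for a glottal opening.
    t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    contour = np.column_stack([50.0 + 5.0 * np.cos(t), 100.0 + 40.0 * np.sin(t)])
    a, b = fit_glottal_axis(contour)
    print("slope:", a, "intercept:", b)
```

With an axis of this kind in hand, each contour point can be assigned to the left or right vocal fold by comparing its x coordinate with `axis_x_at(y, a, b)`, which is one plausible way that per-fold vibratory features could then be measured.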
Document Availability at the Time of Submission
Release the entire work immediately for access worldwide.
Chen, Jing, "Vocal Fold Analysis From High Speed Videoendoscopic Data" (2014). LSU Doctoral Dissertations. 664.